Preface


The last twenty years of the last millennium were characterized by complex
automation of industrial plants, meaning a switch to factories, automatons, robots
and self-adaptive optimization systems. These processes can be intensified by
introducing mathematical methods into all physical and chemical processes. Knowing
the mathematical model of a process makes it possible to control it, maintain it at an
optimal level, provide maximal yield of the product, and obtain the product at minimal
cost. Statistical methods in mathematical modeling of a process should not be opposed
to traditional methods of complete theoretical study of a phenomenon. The higher the
theoretical level of knowledge, the more efficient is the application of statistical
methods such as design of experiments (DOE).

To design an experiment means to choose the optimal experiment design to be used
for varying all the analyzed factors simultaneously. By designing an experiment one
gets more precise data and more complete information on a studied phenomenon with a
minimal number of experiments and the lowest possible material costs. The development
of statistical methods for data analysis, combined with the development of computers,
has revolutionized research and development work in all domains of human activity.

Because statistical methods are abstract and not sufficiently familiar to all
researchers, the first chapter offers the basics of statistical analysis with actual
examples, physical interpretations and solutions to problems. Basic probability
distributions are demonstrated, together with statistical estimation and the testing of
null hypotheses. A detailed analysis of variance (ANOVA) is given for screening
factors according to the significance of their effects on system responses. Sufficient
space has also been dedicated to regression analysis, which is used for statistical
modeling of significant factors by linear and nonlinear regression.

The introduction to design of experiments (DOE) offers an original comparison
between the so-called classical experimental design (one factor at a time, OFAT) and
statistically designed experiments (DOE). Depending on the research objective and
subject, screening experiments (preliminary ranking of the factors, method of random
balance, completely randomized block design, Latin squares, Graeco-Latin squares,
Youden squares), then basic experiments (full factorial experiments, fractional
factorial experiments) and designs of second order (rotatable, D-optimal, orthogonal,
B-designs, Hartley's designs) have been analyzed.

For studies whose objective is to reach an optimum, of particular importance are the
chapters dealing with experimentally attaining an optimum by the gradient method of
steepest ascent and by the nongradient simplex method. In the optimum zone, the
response surface, i.e. the response function, can be described by applying second-order
designs. Processing the results of a second-order design yields quadratic regression
models, the analysis of which is shown in the chapter on canonical analysis of the
response surface.

The third section of the book is dedicated to studies in the field of mixture design.
The same methodological approach has been kept in this field too. One begins with
screening experiments (simplex lattice screening designs, extreme vertices designs of
mixture experiments as screening designs), continues through simplex lattice design,
Scheffé's simplex lattice design, simplex centroid design, extreme vertices design,
D-optimal design, Draper-Lawrence design and full factorial mixture design, and ends
with factorial designs of process factors combined with mixture designs, the so-called
"crossed" designs.

The significance of mixture design for developing new materials should be
particularly stressed. The book is meant for all experts who are engaged in research,
development and process control.

Apart from theoretical bases, the book contains a large number of practical examples
and problems with solutions. This book has come into being as a product of many years
of research activities in the Military Technical Institute in Belgrade. The author is
especially pleased to offer his gratitude to Prof. Dragoljub V. Vukovic, Ph.D.,
Branislav Djulcic, M.Sc. and Paratha Sarathy, B.Sc. For technical editing of the
manuscript I express my special gratitude to Predrag Jovanic, Ph.D., Drago Jaukovic,
B.Sc., Vesna Lazarevic, B.Sc., Stevan Rakovic, machine technician, Dusanka Glavac,
chemical technician and Ljiljana Borkovic.


Morristown,  February  2004 


Zivorad  Lazic 


I 

Introduction  to  Statistics  for  Engineers 


Natural processes and phenomena are conditioned by the interaction of various factors.
By studying the relationships between causes (factors) and phenomena (responses),
science has, to varying degrees, succeeded in penetrating into the essence of phenomena
and processes. Exact sciences can, by the quality of their knowledge, be ranked into
three levels. The top level is the one where all factors that are part of an observed
phenomenon are known, as well as the natural law, i.e. the model by which they interact
and thus produce the observed phenomenon. The relationship of all factors in such a
natural-law phenomenon is given by a formula, i.e. a mathematical model. To give an
example, the following generally known natural laws can be cited:

$$E = \frac{mw^2}{2}; \quad F = ma; \quad S = vt; \quad U = IR; \quad Q = FW$$

The second group, at a slightly lower level, is the one where all factors that are
part of an observed phenomenon are known, but we do not know, or are only partly aware
of, their interrelationships, i.e. influences. This is usually the case when we are
faced with a complex phenomenon consisting of numerous factors. Sometimes we can link
these factors as a system of simultaneous differential equations, but without a general
solution to them. As an example we can cite the Navier-Stokes system of simultaneous
differential equations, used to describe fluid flow.

An even lower level of knowledge of a phenomenon is the case when only a certain
number of the factors that are part of the phenomenon are known to us, i.e. there
exists a large number of factors and we are not certain of having noticed all the
variables. At this level we do not know the natural law, i.e. the mathematical model by
which these factors act. In this case we use experiment (empirical research) in order
to arrive at the natural law.

As examples of this level of knowledge about a phenomenon we can cite the following
empirical dependencies. The Darcy-Weisbach law for the pressure drop of a fluid flowing
through a pipe [1]:


$$\Delta p = \lambda \frac{L}{D} \rho \frac{w^2}{2}$$


Ergun's equation for the pressure drop of a fluid flowing through a bed of solid
particles [1]:


$$\frac{\Delta p}{H} = 150 \frac{(1-\varepsilon)^2}{\varepsilon^3} \frac{\mu_f}{d_p^2} w + 1.75 \frac{1-\varepsilon}{\varepsilon^3} \frac{\rho_f}{d_p} w^2$$


The equation describing the heating or cooling of a fluid flowing inside or outside a
pipe without phase change [1]:


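$$\frac{\alpha}{cG} \left( \frac{c\mu}{\lambda} \right)^{0.67} \left( \frac{L_H}{D} \right)^{0.33} \left( \frac{\mu_{ST}}{\mu} \right)^{0.14} = \frac{1.86}{\left( \frac{DG}{\mu} \right)^{0.67}}$$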


The first case is quite clear: it represents deterministic and functional laws, while
the second and third levels are examples of stochastic phenomena defined by stochastic
dependencies. A stochastic dependency, i.e. natural law, is not expressed in individual
cases; it shows its functional connection only when these cases are observed as a mass.
A stochastic dependency thus contains two concepts: the function discovered in a mass
of cases as an average, and smaller or greater deviations of individual cases from that
relationship.

The lowest level in observing a phenomenon is when we are faced with a totally new
phenomenon where both the factors and the law of changes are unknown to us, i.e. the
outcomes (responses) of the observed phenomenon are random values for us. This
randomness is objectively a consequence of our inability to observe simultaneously all
relations and influences of all factors on system responses. Through its development,
science continually discovers new connections, relationships and factors, which shifts
the boundary between randomness and lawfulness.

Based on this analysis one can conclude that stochastic processes are phenomena that
are neither completely random nor strictly determined, i.e. random and deterministic
phenomena are the left and right limits of stochastic phenomena. To find stochastic
relationships, present-day engineering practice uses, among other tools, experiment and
statistical analysis of the obtained results.

Statistics, the science of description and interpretation of numerical data, began in
its most rudimentary form in the census and taxation of ancient Egypt and Babylon.
Statistics progressed little beyond this simple tabulation of data until the
theoretical developments of the eighteenth and nineteenth centuries. As experimental
science developed, the need grew for improved methods of presentation and analysis of
numerical data.

The pioneers in mathematical statistics, such as Bernoulli, Poisson, and Laplace, had
developed statistical and probability theory by the middle of the nineteenth century.
Probably the first instance of applied statistics came in the application of
probability theory to games of chance. Even today, probability theorists frequently
choose a coin or a deck of cards as their experimental model. Application of statistics
in biology developed in England in the latter half of the nineteenth century. The first
important application of statistics in the chemical industry occurred in a factory in
Dublin, Ireland, at the turn of the century. Out of the need to approach some
technological problems scientifically, several graduate mathematicians from Oxford and
Cambridge, including W. S. Gosset, were engaged. Having accepted the job in 1899,
Gosset applied his knowledge of mathematics and chemistry to control the quality of
finished products. His method of small samples was later applied in all fields of human
activity. He published his method in 1908 under the pseudonym "Student", by which it is
known to this day. The method was applied only to a limited extent in industry up to
1920. Wider application was registered during World War Two in military industries.
Since then statistics and probability theory have been applied in all fields of
engineering.

With the development of electronic computers, statistical methods began to thrive and
to take an ever more important role in empirical research and system optimization.

Statistical methods of researching phenomena can be divided into two basic groups.
The first includes methods of recording, processing and describing the variables of
observed phenomena and belongs to descriptive statistics. As a result of applying
descriptive statistics we obtain numerical information on observed phenomena, i.e.
statistical data that can be presented in tables and graphs. The second group is
represented by methods of statistical analysis, the task of which is to clarify the
observed variability by means of classification and correlation indicators of
statistical series. This is the field of inferential statistics, which, however, cannot
be strictly set apart from descriptive statistics.

The subjects of statistical research are the population (universe, statistical mass,
basic universe, totality) and samples taken from a population. The population must
represent the output of a continuous chemical process with respect to certain features,
i.e. properties of the given products. To determine a property of a product, we have to
take a sample from a population that, in the theory of mathematical statistics, is
usually an infinite collection of elements (units).

For example, we can take every hundredth unit from a steady process and subject it to
chemical analysis or some other treatment in order to establish a certain property
(taking a sample from a chemical reactor to establish the yield of a chemical reaction,
taking a sample of a rocket propellant to establish mechanical properties such as
tensile strength, elongation at break, etc.). After taking a sample and determining its
properties we can apply descriptive statistics to characterize the sample. However, if
we wish to draw conclusions about the population from the sample, we must use methods
of statistical inference.

What  can  we  infer  about  the  population  from  our  sample?  Obviously  the  sample 
must  be  a representative  selection  of  values  taken  from  the  population  or  else  we 
can  infer  nothing.  Hence,  we  must  select  a random  sample. 

A random sample is a collection of values selected from a population of values in
such a way that each value in the population has an equal chance of being selected.

Often the underlying population is completely hypothetical. Suppose we make five runs
of a new chemical reaction in a batch reactor at constant conditions, and then analyze
the product. Our sample is the data from the five runs; but where is the population? We
can postulate a hypothetical population of "all runs made at these conditions now and
in the future". We take a sample and conclude that it will be representative of a
population consisting of possible future runs, so the population may well be infinite.

If  our  inferences  about  the  population  are  to  be  valid,  we  must  make  certain  that 
future  operating  conditions  are  identical  with  those  of  the  sample. 

For  a sample  to  be  representative  of  the  population,  it  must  contain  data  over  the 
whole  range  of  values  of  the  measured  variables.  We  cannot  extrapolate  conclusions 
to other ranges of variables. A single value computed from a series of observations
(sample) is called a "statistic".


Mean,  median  and  mode  as  measures  of  location 

By the sample mean $\bar{X}$ we understand the arithmetic average of the property
values $X_1, X_2, X_3, \ldots, X_n$. When we say average, we are frequently referring to
the sample mean, which is defined as the sum of all the values in the sample divided by
the number of values in the sample. The sample mean (average) is the simplest and most
important of all measures of location.


$$\bar{X} = \frac{\sum X_i}{n} \tag{1.1}$$


where:

$\bar{X}$ is the sample mean (average) of the n values,

$X_i$ is any given value from the sample.

The symbol $\bar{X}$ is used for the sample mean. It is an estimate of the mean of the
underlying population, which is designated $\mu$. We can never determine $\mu$ exactly
from the sample, except in the trivial case where the sample includes the entire
population, but we can estimate it quite closely from the sample mean. Another average
that is frequently used as a measure of location is the median. The median is defined
as the observation that has the same number of observations below it as above it, i.e.
the central observation of a sample whose values are arranged in order of size. For an
even number of observations, the median is conventionally taken as the mean of the two
central values.

A third measure of location is the mode, which is defined as that value of the
measured variable for which there are the most observations. The mode is the most
probable value of a discrete random variable, while for a continuous random variable it
is the value at which the probability density function reaches its maximum. Practically
speaking, it is the value of the measured response, i.e. the property, that is the most
frequent in the sample. The mean is the most widely used, particularly in statistical
analysis. The median is occasionally more appropriate than the mean as a measure of
location. The mode is rarely used. For symmetrical unimodal distributions, such as the
normal distribution, the three values are identical.

Example 1.1 [2]

As an example of the differences among the three measures of location, let us consider
the salary levels in a small company. The annual salaries are:


President                   50,000
Salesman                    15,000
Accountant                   8,000
Foreman                      7,000
Two technicians, each        6,000
Four workmen, each           4,000

If the given salaries are put in an ordered array we get:

4,000; 4,000; 4,000; 4,000; 6,000; 6,000; 7,000; 8,000; 15,000; 50,000

The mode is 4,000 (the most frequent value) and the median is 6,000 (the central value
of the array).

During salary negotiations with the company union, the president states that the
average salary among the 10 employees is $10,800/yr, and there is certainly no need for
a raise. The union representative states that there is a great need for a raise because
over half of the employees are earning $6,000 or less, and that more men are making
$4,000 than anything else. Clearly, the president has used the mean; and the union, the
median and mode.
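
The three measures of location in Example 1.1 can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module; the variable names are illustrative and not part of the original example.

```python
import statistics

# Annual salaries from Example 1.1, one entry per employee
salaries = [50_000, 15_000, 8_000, 7_000] + [6_000] * 2 + [4_000] * 4

mean_salary = statistics.mean(salaries)      # arithmetic average
median_salary = statistics.median(salaries)  # central value of the sorted array
mode_salary = statistics.mode(salaries)      # most frequent value

print("mean:  ", mean_salary)
print("median:", median_salary)
print("mode:  ", mode_salary)
```

For the salaries as listed, the mean works out to 10,800, the median to 6,000 and the mode to 4,000. The single large salary pulls the mean well above the median.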


Measures  of  variability,  the  range,  the  mean  deviation  and  variance 

As we have seen, the mean (average), median and mode are measures of location. Having
determined the location of our data, we might next ask how the data are spread out
about the mean. The simplest measure of variability is the range or interval. The range
is defined as the difference between the largest and smallest values in the sample.

$$\text{range} = X_{\max} - X_{\min} \tag{1.2}$$

This measure can be calculated easily, but it offers only an approximate measure of
the variability of the data, as it is influenced only by the extreme values of the
observed property, which can be quite different from the other values. For a more
precise measure of variability we have to include all property (response) values, i.e.
all their deviations from the sample mean. As the mean of the deviations from the
sample mean is equal to zero, the absolute values of the deviations are used instead.
The mean deviation is defined as the mean of the absolute values of the deviations from
the sample mean:

$$m = \frac{1}{N} \sum_{i=1}^{N} \left| X_i - \bar{X} \right| \tag{1.3}$$

The  most  popular  method  of  reporting  variability  is  the  sample  variance,  defined  as: 


$$S_X^2 = \frac{\sum \left( X_i - \bar{X} \right)^2}{n-1} \tag{1.4}$$



A useful  calculation  formula  is: 

$$S_X^2 = \frac{n \sum X_i^2 - \left( \sum X_i \right)^2}{n(n-1)} \tag{1.5}$$

The sample variance is essentially the sum of the squares of the deviations of the
data points from the mean value, divided by (n-1). A large value of the variance
indicates that the data are widely spread about the mean. In contrast, if all values
for the data points were nearly the same, the sample variance would be very small. The
standard deviation $S_X$ is defined as the square root of the variance. The standard
deviation is expressed in the same units as the random variable values, and hence in
the same units as the average. This makes it possible to compare the variability of
different distributions by introducing a relative measure of variability, called the
coefficient of variation:

$$k_v = \frac{S_X}{\bar{X}} \times 100\% \tag{1.6}$$

A large  value  of  variation  coefficient  indicates  that  the  data  are  widely  spread 
about  the  mean.  In  contrast,  if  all  values  for  the  data  points  were  nearly  the  same, 
the  variation  coefficient  would  be  very  small. 
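
The measures of variability defined in Eqs. (1.2) to (1.6) can be sketched in Python as follows. The function name `variability` and the five sample values are illustrative assumptions, chosen only to show that the definition (1.4) and the calculation formula (1.5) give the same variance.

```python
import math

def variability(values):
    """Return the measures of variability of a sample as a dictionary."""
    n = len(values)
    mean = sum(values) / n
    value_range = max(values) - min(values)                       # Eq. (1.2)
    mean_dev = sum(abs(x - mean) for x in values) / n              # Eq. (1.3)
    variance = sum((x - mean) ** 2 for x in values) / (n - 1)      # Eq. (1.4)
    # Eq. (1.5): calculation formula that avoids computing the deviations
    shortcut = (n * sum(x * x for x in values) - sum(values) ** 2) / (n * (n - 1))
    std_dev = math.sqrt(variance)
    coeff_var = std_dev / mean * 100                               # Eq. (1.6), in %
    return {
        "range": value_range,
        "mean_deviation": mean_dev,
        "variance": variance,
        "variance_shortcut": shortcut,
        "std_dev": std_dev,
        "coeff_var_percent": coeff_var,
    }

sample = [1, 0, 4, 8, 0]  # illustrative values only
result = variability(sample)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

Note that the coefficient of variation is only meaningful when the sample mean is not close to zero.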

Example  1.2  [2] 

Suppose we took ten different sets of five random observations on X and then
calculated sample means and variances for each of the ten groups.


Sample   Values       Sample mean   Sample variance
1        1;0;4;8;0    2.6           11.8
2        2;2;3;6;8    4.2            7.2
3        2;4;1;3;0    2.0            2.5
4        4;2;1;6;7    4.0            6.5
5        3;7;5;7;0    4.4            8.8
6        7;7;9;2;1    5.2           12.2
7        9;9;5;6;2    6.2            8.7
8        9;6;0;3;1    3.8           13.7
9        8;9;5;7;9    7.6            2.8
10       8;5;4;7;5    5.8            2.7
Means                 4.58           7.69

We would have ten different values of sample variance. It can be shown that these
values would have a mean value nearly equal to the population variance $\sigma_X^2$.
Similarly, the mean of the sample means will be nearly equal to the population mean
$\mu$. Strictly speaking, our ten groups will not give us exact values for $\sigma_X^2$
and $\mu$. To obtain these, we would have to take an infinite number of groups, and
hence our sample would include the entire infinite population; in statistics this is
expressed by Glivenko's theorem [3].

To illustrate the difference between values of sample estimates and population
parameters, consider the ten groups of five numbers each as shown in the table. The
sample means and sample variances have been calculated from the appropriate formulas
and tabulated. Usually we could say no more than that these values are estimates of the
population parameters $\mu$ and $\sigma_X^2$, respectively. However, in this case the
numbers in the table were selected from a table of random numbers ranging from 0 to 9
(Table A). In such a table of random numbers, even of infinite size, the proportion of
each number is equal to 1/10. This equal proportion permits us to evaluate the
population parameters exactly:

$$\mu = \frac{0+1+2+3+4+5+6+7+8+9}{10} = 4.50; \qquad \sigma_X^2 = \frac{\sum \left( X_i - \mu \right)^2}{10} = 8.25$$

We can now see that our sample means in the ten groups scatter around the population
mean. The mean of the ten group means is 4.58, which is close to the population mean.
The two would be identical if we had an infinite number of groups. Similarly, the
sample variances scatter around the population variance, and their mean of 7.69 is
close to the population variance.

What  we  have  done  in  the  table  is  to  take  ten  random  samples  from  the  infinite 
population  of  numbers  from  0 to  9.  In  this  case,  we  know  the  population  parameters 
so  that  we  can  get  an  idea  of  the  accuracy  of  our  sample  estimates. 
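
The comparison made in Example 1.2 can be reproduced with a short Python sketch. It uses the ten groups from the table and the standard `statistics` module; the variable names are illustrative. Here `statistics.variance` divides by n-1 (sample variance), while `statistics.pvariance` divides by the number of values (population variance).

```python
import statistics

# The ten groups of five random digits from the table in Example 1.2
groups = [
    [1, 0, 4, 8, 0], [2, 2, 3, 6, 8], [2, 4, 1, 3, 0], [4, 2, 1, 6, 7],
    [3, 7, 5, 7, 0], [7, 7, 9, 2, 1], [9, 9, 5, 6, 2], [9, 6, 0, 3, 1],
    [8, 9, 5, 7, 9], [8, 5, 4, 7, 5],
]

sample_means = [statistics.mean(g) for g in groups]
sample_variances = [statistics.variance(g) for g in groups]  # divides by n - 1

# Population: the digits 0..9, each with proportion 1/10
digits = list(range(10))
pop_mean = statistics.mean(digits)
pop_variance = statistics.pvariance(digits)  # divides by the population size

print("mean of sample means:    ", statistics.mean(sample_means))
print("mean of sample variances:", statistics.mean(sample_variances))
print("population mean:         ", pop_mean)
print("population variance:     ", pop_variance)
```

The averages of the ten sample statistics (about 4.58 and 7.69) scatter around, but do not coincide with, the population values 4.5 and 8.25.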


Problem  1.1 

From the table of random numbers take 20 different samples of
10 random numbers each. Determine the sample mean and sample
variance for each sample. Calculate the averages of the obtained
"statistics" and compare them to the population parameters.


1.1 

The  Simplest  Discrete  and  Continuous  Distributions 

In  analyzing  an  engineering  problem,  we  frequently  set  up  a mathematical  model 
that  we  believe  will  describe  the  system  accurately.  Such  a model  may  be  based  on 
past  experience,  on  intuition,  or  on  a theory  of  the  physical  behavior  of  the  system. 

Once  the  mathematical  model  is  established,  data  are  taken  to  verify  or  reject  it. 
For  example,  the  perfect  gas  law  (PV  = nRT)  is  a mathematical  model  that  has  been 
found  to  describe  the  behavior  of  a few  real  gases  at  moderate  conditions.  It  is  a 
“law”  that  is  frequently  violated  because  our  intuitive  picture  of  the  system  is  too 
simple. 

In many engineering problems, the physical mechanism of the system is too complex
and not sufficiently understood to permit the formulation of even an approximately
accurate model, such as the perfect gas law. When such complex systems are in question,
it is recommended to use statistical models that describe the behavior of a system with
greater or lesser, but always known, accuracy.

In this chapter, we will consider probability theory, which provides the simple
statistical models needed to describe the drawing of samples from a population, i.e.
simple probability models are useful in describing the presumed population underlying a
random sample of data. Among the most important concepts of probability theory is the
notion of a random variable. As the realization of each random event can be
characterized numerically, the quantities that take those numerical values with
definite probabilities are called random variables. A random variable is often defined
as a function that assigns a number to each elementary event. Thus, influenced by
random circumstances, a random variable can take various numerical values. One cannot
tell in advance which of those values the random variable will take, for its values
differ from experiment to experiment, but one can know in advance all the values it can
take. To characterize a random variable completely one should know not only what values
it can take but also how frequently, i.e. with what probability, it takes those values.
The number of different values a random variable takes in a given experiment can be
finite. If a random variable takes a finite number of values with corresponding
probabilities, it is called a discrete random variable. The number of defective
products produced during a working day, the number of heads one gets when tossing two
coins, etc., are discrete random variables. A random variable is continuous if, with
corresponding probability, it can take any numerical value in a definite range.
Examples of continuous random variables are the waiting time for a bus, the time
between emissions of particles in radioactive decay, etc.

The  simplest  probability  model 

Probability  theory  was  originally  developed  to  predict  outcomes  of  games  of  chance. 
Hence  we  might  start  with  the  simplest  game  of  chance:  a single  coin.  We  intuitively 
conclude  that  the  chance  of  the  coin  coming  up  heads  or  tails  is  equally  possible. 
That  is,  we  assign  a probability  of  0.5  to  either  event.  Generally  the  probabilities  of 
all  possible  events  are  chosen  to  total  1.0. 

If  we  toss  two  coins,  we  note  that  the  fall  of  each  coin  is  independent  of  the  other. 
The  probability  of  either  coin  landing  heads  is  thus  still  0.5.  The  probability  of  both 
coins  falling  heads  is  the  product  of  the  probabilities  of  the  single  events,  since  the 
single  events  are  independent: 

$$P(\text{both heads}) = 0.5 \times 0.5 = 0.25$$

Similarly,  the  probability  of  100  coins  all  falling  heads  is  extremely  small: 

$$P(100\ \text{heads}) = 0.5^{100} \approx 7.9 \times 10^{-31}$$

A single coin is an example of a "Bernoulli" distribution. This probability
distribution limits the values of the random variable to exactly two discrete values,
one with probability p, and the other with probability (1-p). For the coin, the two
values are heads, with probability p, and tails, with probability (1-p), where p = 0.5
for a "fair" coin.

The  Bernoulli  distribution  applies  wherever  there  are  just  two  possible  outcomes 
for  a single  experiment.  It  applies  when  a manufactured  product  is  acceptable  or 
defective;  when  a heater  is  on  or  off;  when  an  inspection  reveals  a defect  or  does  not. 
The Bernoulli distribution is often represented by 1 and 0 as the two possible
outcomes, where 1 might represent heads or product acceptance and 0 would represent
tails or product rejection.
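
The product rule for independent coins and the 1/0 coding of the Bernoulli distribution can be sketched in Python. The function names `bernoulli_pmf` and `prob_all_heads` are hypothetical helpers introduced only for this illustration.

```python
def bernoulli_pmf(x, p):
    """Probability of outcome x (1 or 0) for a Bernoulli variable with P(1) = p."""
    if x not in (0, 1):
        raise ValueError("a Bernoulli outcome must be 0 or 1")
    return p if x == 1 else 1 - p

def prob_all_heads(n_coins, p_heads=0.5):
    """Probability that n independent coins all land heads (product rule)."""
    return p_heads ** n_coins

print(bernoulli_pmf(1, 0.5), bernoulli_pmf(0, 0.5))  # heads and tails of a fair coin
print(prob_all_heads(2))                             # two coins both heads
print(prob_all_heads(100))                           # one hundred coins all heads
```

The same functions apply to any two-outcome experiment, for example an acceptable or defective product, once p is set to the probability of the outcome coded as 1.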

Mean  and  variance 

The tossing of a coin is an experiment whose outcome is a random variable. Intuitively
we assume that all coin tosses occur from an underlying population where the
probability  of  heads  is  exactly  0.5.  However,  if  we  toss  a coin  100  times,  we  may  get 
54  heads  and  46  tails.  We  can  never  verify  our  intuitive  estimate  exactly,  although 
with  a large  sample  we  may  come  very  close. 
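
How the observed fraction of heads approaches the population value 0.5 as the sample grows can be illustrated by simulation. The sketch below uses Python's pseudo-random generator with a fixed seed; the function name `heads_fraction` and the chosen sample sizes are assumptions made for illustration, and a simulated coin is of course only a stand-in for a real one.

```python
import random

def heads_fraction(n_tosses, p_heads=0.5, seed=0):
    """Simulate n_tosses of a coin and return the observed fraction of heads."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    heads = sum(rng.random() < p_heads for _ in range(n_tosses))
    return heads / n_tosses

for n in (100, 10_000, 100_000):
    print(n, heads_fraction(n))
```

With a small number of tosses the fraction can differ noticeably from 0.5, as in the 54 heads out of 100 mentioned above; with larger samples the deviation typically shrinks.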

How  are  the  experimental  outcomes  related  to  the  population  mean  and  variance? 

A useful  concept  is  that  of  the  “expected  value”.  The  expected  value  is  the  sum  of  all 
possible  values  of  the  outcome  of  an  experiment,  with  each  value  weighted  with  a 
probability  of  obtaining  that  outcome.  The  expected  value  is  a weighted  average. 

The  “mean”  of  the  population  underlying  a random  variable  X is  defined  as  the 
expected  value  of  X: 

$$\mu = E(X) = \sum X_i p_i \tag{1.7}$$

where:

$\mu$ is the population mean;

$E(X)$ is the expected value of X.

By appropriate manipulation, it is possible to determine the expected value of various
functions of X, which is the subject of probability theory. For example, the expected
value of $X^2$ is simply the sum of the squares of the values, each weighted by the
probability of obtaining the value.

The  population  variance  of  the  random  variable  X is  defined  as  the  expected  value 
of  the  square  of  the  difference  between  a value  of  X and  the  mean: 

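$$\sigma^2 = E\left[ (X - \mu)^2 \right] \tag{1.8}$$

$$\sigma^2 = E\left( X^2 - 2X\mu + \mu^2 \right) = E\left( X^2 \right) - 2\mu E(X) + \mu^2 \tag{1.9}$$

As $E(X) = \mu$, we get:

$$\sigma^2 = E\left( X^2 \right) - \mu^2 \tag{1.10}$$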

By  using  the  mentioned  relations  for  Bernoulli’s  distribution  we  get: 

$$E(X) = \sum_i X_i p_i = (p)(1) + (1 - p)(0) = p \qquad (1.11)$$

$$E\left(X^2\right) = \sum_i X_i^2 p_i = (p)(1^2) + (1 - p)(0^2) = p \qquad (1.12)$$

So that μ = p and σ² = p - p². For the coin toss:
p = 0.5; μ = 0.5; σ² = 0.25.
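The definitions in Eqs. (1.7) and (1.10) can be checked numerically. The following is a minimal sketch in Python, which the text itself does not use; the function names are illustrative assumptions.

```python
def expected_value(values, probs):
    """Weighted average: sum of x_i * p_i over all outcomes, as in Eq. (1.7)."""
    return sum(x * p for x, p in zip(values, probs))


def variance(values, probs):
    """Population variance computed as E(X^2) - mu^2, as in Eq. (1.10)."""
    mu = expected_value(values, probs)
    second_moment = expected_value([x * x for x in values], probs)
    return second_moment - mu ** 2


# Bernoulli coin toss: X = 1 (heads) with p = 0.5, X = 0 (tails) with 1 - p
p = 0.5
coin_mean = expected_value([1, 0], [p, 1 - p])
coin_var = variance([1, 0], [p, 1 - p])
print(coin_mean, coin_var)  # 0.5 0.25
```

The same two functions work for any discrete distribution given as a list of values and a matching list of probabilities.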
Discrete Distributions


A discrete distribution function assigns probabilities to several separate outcomes of
an experiment. The total probability, which equals one, is distributed
among the individual values of the random variable. A random variable is fully defined when its
probability distribution is given. The probability distribution of a discrete random
variable gives the probability of obtaining each of its discrete (separate) values.
It is a step function where the probability changes only at discrete values of the
random variable. The Bernoulli distribution assigns probability to two discrete outcomes
(heads or tails; on or off; 1 or 0, etc.). Hence it is a discrete distribution.

Drawing  a playing  card  at  random  from  a deck  is  another  example  of  an  experiment 
with  an  underlying  discrete  distribution,  with  equal  probability  (1/52)  assigned  to 
each  card.  For  a discrete  distribution,  the  definition  of  the  expected  value  is: 

$$E(X) = \sum_i X_i p_i \qquad (1.13)$$

where:

X_i is the value of an outcome, and

p_i is the probability that the outcome will occur.

The  population  mean  and  variance  defined  here  may  be  related  to  the  sample 
mean  and  variance,  and  are  given  by  the  following  formulas: 

$$E(\bar{X}) = E\left(\frac{\sum X_i}{n}\right) = \frac{\sum E(X_i)}{n} = \frac{\sum \mu}{n} = \frac{n\mu}{n} = \mu \qquad (1.14)$$

$$E(\bar{X}) = \mu \qquad (1.15)$$

Equation  (1.15)  shows  that  the  expected  value  (or  mean)  of  the  sample  means  is 
equal  to  the  population  mean. 

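The expected value of the sample variance is found to be the population variance:

$$E\left(S^2\right) = E\left[\frac{\sum (X_i - \bar{X})^2}{n - 1}\right] = \frac{E\left[\sum (X_i - \bar{X})^2\right]}{n - 1} \qquad (1.16)$$

Since:

$$\sum (X_i - \bar{X})^2 = \sum X_i^2 - n\bar{X}^2 \qquad (1.17)$$

we find that:

$$E\left(S^2\right) = \frac{E\left(\sum X_i^2\right) - nE\left(\bar{X}^2\right)}{n - 1} = \frac{\sum E\left(X_i^2\right) - nE\left(\bar{X}^2\right)}{n - 1} \qquad (1.18)$$

It can be shown that:

$$E\left(X_i^2\right) = \sigma^2 + \mu^2; \quad E\left(\bar{X}^2\right) = \frac{\sigma^2}{n} + \mu^2 \qquad (1.19)$$

so that:

$$E\left(S^2\right) = \frac{n\left(\sigma^2 + \mu^2\right) - \sigma^2 - n\mu^2}{n - 1} \qquad (1.20)$$

and finally:

$$E\left(S^2\right) = \sigma^2 \qquad (1.21)$$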


The  definition  of  sample  variance  with  an  (n-1)  in  the  denominator  leads  to  an 
unbiased  estimate  of  the  population  variance,  as  shown  above.  Sometimes  the  sam- 
ple variance  is  defined  as  the  biased  variance: 


$$S^2 = \frac{\sum (X_i - \bar{X})^2}{n} \qquad (1.22)$$

So that in this case:

$$E\left(S^2\right) = \frac{n - 1}{n}\,\sigma^2 \qquad (1.23)$$
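The difference between the unbiased and the biased definition of the sample variance can be illustrated by simulation. The sketch below is only an illustration under assumed values (a normal population with mean 10 and variance 4, samples of size 5, a fixed random seed); none of these numbers come from the text.

```python
import random


def variance_estimates(sample):
    """Return (unbiased, biased) variance estimates of one sample."""
    n = len(sample)
    mean = sum(sample) / n
    sum_sq = sum((x - mean) ** 2 for x in sample)
    return sum_sq / (n - 1), sum_sq / n


rng = random.Random(0)     # fixed seed so that the run is repeatable
n, trials = 5, 20000       # hypothetical sample size and number of samples
sigma2 = 4.0               # assumed population variance (sigma = 2)

unbiased_sum = 0.0
biased_sum = 0.0
for _ in range(trials):
    sample = [rng.gauss(10.0, 2.0) for _ in range(n)]
    s2_unbiased, s2_biased = variance_estimates(sample)
    unbiased_sum += s2_unbiased
    biased_sum += s2_biased

avg_unbiased = unbiased_sum / trials   # expected to lie near sigma2
avg_biased = biased_sum / trials       # expected to lie near (n - 1) / n * sigma2
print(round(avg_unbiased, 2), round(avg_biased, 2))
```

With n = 5 the average of the biased estimates should settle near 0.8 σ², in line with Eq. (1.23), while the average of the unbiased estimates should settle near σ².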


A more  useful  and  more  frequently  used  distribution  is  the  binomial  distribution. 
The  binomial  distribution  is  a generalization  of  the  Bernoulli  distribution.  Suppose 
we  perform  a Bernoulli-type  experiment  a finite  number  of  times.  In  each  trial, 
there  are  only  two  possible  outcomes,  and  the  outcome  of  any  trial  is  independent  of 
the  other  trials.  The  binomial  distribution  gives  the  probability  of  k identical  out- 
comes occurring  in  n trials,  where  any  one  of  the  k outcomes  has  the  probability  p 
of  occurring  in  any  one  (Bernoulli)  trial: 


$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k} \qquad (1.24)$$

The symbol "n over k" is referred to as the combination of n items taken k at a
time. It is defined as:

$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!} \qquad (1.25)$$


Example  1.3 

Suppose we know that, on the average, 10% of the items produced in a process are
defective. What is the probability that we will get two defective items in a sample of
ten, drawn randomly from the product population?


Here, n = 10; k = 2; p = 0.1, so that:

$$P(X = 2) = \binom{10}{2} \times (0.1)^2 \times (0.9)^8 = 0.1937$$

The chances are about 19 out of 100 that two out of ten in the sample are defective.
On the other hand, chances are only one out of ten billion that all ten would be
found defective. Values of P(X=k) for other values of k may be calculated and plotted to
give a graphic representation of the probability distribution (Fig. 1.1).
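The calculation of Example 1.3, and the remaining values of P(X = k) that make up the distribution, can be sketched in a few lines of Python. The helper name `binomial_pmf` is an assumption made for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for n independent Bernoulli trials, following Eq. (1.24)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.1   # sample of ten, 10 % defective items
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"k = {k:2d}   P(X = k) = {prob:.4f}")
```

The printed values for k = 0 to 10 are the ordinates of the binomial distribution for p = 0.1 and n = 10; they sum to one, and k = 1 has the largest probability.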
Figure 1.1 Binomial distribution for p = 0.1 and n = 10.


Table 1.1 Discrete distributions.

| Distribution | Probability function | Mean | Variance | Model | Example |
|---|---|---|---|---|---|
| Bernoulli | $X = 1$ with probability $p$; $X = 0$ with probability $(1 - p)$ | $p$ | $p(1 - p)$ | Single experiment, two possible outcomes | Heads or tails with a coin |
| Binomial | $P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$ | $np$ | $np(1 - p)$ | n Bernoulli experiments with k outcomes of one kind | Number of defective items in a sample drawn with replacement from a finite population |
| Hypergeometric | $P(X = k) = \binom{M}{k} \binom{N - M}{n - k} / \binom{N}{n}$ | $\frac{nM}{N}$ | $\frac{nM(N - M)(N - n)}{N^2 (N - 1)}$ | M objects of one kind in a population of N objects; k objects of kind M found in a drawing of n objects. The n objects are drawn from the population without replacement after each drawing. | Number of defective items in a sample drawn without replacement from a finite population |
| Geometric | $P(X = k) = p(1 - p)^k$ | $\frac{1 - p}{p}$ | $\frac{1 - p}{p^2}$ | Number of failures before the first success in a sequence of Bernoulli trials | Number of tails before the first head |
| Poisson | $P(X = k) = e^{-\lambda t} (\lambda t)^k / k!$ | $\lambda t$ | $\lambda t$ | Random occurrence with time. Probability of k occurrences in an interval of width t; $\lambda$ is a constant parameter | Radioactive decay, equipment breakdown |
One defective item in a sample is shown to be most probable; but this sample
proportion  occurs  less  than  four  times  out  of  ten,  even  though  it  is  the  same  as  the 
population  proportion.  In  the  previous  example,  we  would  expect  about  one  out  of 
ten  sampled  items  to  be  defective.  We  have  intuitively  taken  the  population  propor- 
tion (p  = 0.1)  to  be  the  expected  value  of  the  proportion  in  the  random  sample.  This 
proves  to  be  correct.  It  can  be  shown  that  for  the  binomial  distribution: 

$$\mu = np; \quad \sigma^2 = np(1 - p) \qquad (1.26)$$

Thus for the previous example:

$$\mu = 10 \times 0.1 = 1; \quad \sigma^2 = 10 \times 0.1 \times 0.9 = 0.9$$


Example  1.4  [4] 

The  probability  that  a compression  ring  fitting  will  fail  to  seal  properly  is  0.1.  What 
is  the  expected  number  of  faulty  rings  and  their  variance  if  we  have  a sample  of  200 
rings? 

Assuming  that  we  have  a binomial  distribution,  we  have: 

$$\mu = n \times p = 200 \times 0.1 = 20; \quad \sigma^2 = np \times (1 - p) = 200 \times 0.1 \times 0.9 = 18$$

A number of other discrete distributions are listed in Table 1.1, along with the
model on which each is based. Apart from the discrete distributions already mentioned,
the hypergeometric distribution is also used. The hypergeometric distribution is
equivalent to the binomial distribution in sampling from infinite populations. For
finite populations, the binomial distribution presumes replacement of an item
before another is drawn, whereas the hypergeometric distribution presumes no
replacement.
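The statement that the two distributions agree for very large populations can be illustrated numerically. The sketch below compares the probability of two defectives in a sample of ten for hypothetical lots of increasing size, each containing 10% defective items; the lot sizes and function names are assumptions chosen for illustration.

```python
from math import comb


def hypergeom_pmf(k, N, M, n):
    """P(X = k): k defectives among n items drawn without replacement
    from a lot of N items that contains M defectives."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)


def binomial_pmf(k, n, p):
    """P(X = k) for n draws with replacement, defective fraction p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, k, p = 10, 2, 0.1
binom = binomial_pmf(k, n, p)

results = {}
for N in (50, 500, 5000):          # hypothetical lot sizes
    M = N // 10                    # 10 % of the lot is defective
    results[N] = hypergeom_pmf(k, N, M, n)
    print(f"N = {N:5d}  hypergeometric = {results[N]:.4f}  binomial = {binom:.4f}")
```

As the lot size N grows, removing an item changes the defective fraction less and less, so the hypergeometric value approaches the binomial one.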

1.1.2 

Continuous  Distribution 


A continuous distribution function assigns probability to a continuous range of values
of a random variable. Any single value has zero probability assigned to it. The
continuous distribution may be contrasted with the discrete distribution, where
probability was assigned to single values of the random variable. Consequently, a
continuous random variable cannot be characterized by the values it takes and their
corresponding probabilities. Therefore, in the case of a continuous random variable we
observe the probability P(x < X < x + Δx) that it takes values from the range (x, x + Δx),
where Δx can be an arbitrarily small number. The deficiency of this probability is
that it depends on Δx and tends to zero when Δx → 0. In order to overcome
this deficiency let us observe the function:


$$f(x) = \lim_{\Delta x \to 0} \frac{P(x < X < x + \Delta x)}{\Delta x}; \quad f(x) = \frac{dP(x)}{dx} \qquad (1.27)$$

which does not depend on Δx and is called the probability density function of the
continuous random variable X. The probability that the random variable lies between
any two specific values in a continuous distribution is:


$$P(a < X < b) = \int_a^b f(x)\,dx \qquad (1.28)$$


where  f(x)  is  the  probability  density  function  of  the  underlying  population  model. 
Since all values of X lie between minus infinity and plus infinity (-∞; +∞), the
probability of finding X within these limits is 1. Hence for all continuous distributions:

$$\int_{-\infty}^{+\infty} f(x)\,dx = 1 \qquad (1.29)$$

The  expected  value  of  a continuous  distribution  is  obtained  by  integration,  in  con- 
trast to  the  summation  required  for  discrete  distributions.  The  expected  value  of  the 
random  variable  X is  defined  as: 


$$E(X) = \int_{-\infty}^{+\infty} x f(x)\,dx \qquad (1.30)$$


The quantity f(x)dx is analogous to the discrete probability p_i defined earlier, so that
Equation (1.30) is analogous to Equation (1.13). Equation (1.30) also defines the mean of
a continuous distribution, since μ = E(X). The variance is defined as:

$$\sigma^2 = \int_{-\infty}^{+\infty} (x - \mu)^2 f(x)\,dx \qquad (1.31)$$

or by expression:

$$\sigma^2 = \int_{-\infty}^{+\infty} x^2 f(x)\,dx - \left[\int_{-\infty}^{+\infty} x f(x)\,dx\right]^2 \qquad (1.32)$$


The simplest continuous distribution is the uniform distribution, which assigns a
constant density function over a region of values from a to b, and assigns zero probability
to all other values of the random variable (Figure 1.2).

The probability density function for the uniform distribution is obtained by integrating
over all values of x, with f(x) constant between a and b, and zero outside of
the region between a and b:

$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_a^b f(x)\,dx = 1 \qquad (1.33)$$

After integrating this relation, we get:

$$f(x) = \frac{1}{\int_a^b dx} = \frac{1}{b - a} = \text{const} \qquad (1.34)$$

Figure 1.2 Uniform Distribution (f(X) = 1/(b - a) for a < X < b, zero elsewhere)


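It then follows that:

$$\mu = E(X) = \int_a^b \frac{x\,dx}{b - a} = \frac{1}{2}(a + b) \qquad (1.35)$$

We also get that:

$$\sigma^2 = \int_a^b \frac{(x - \mu)^2}{b - a}\,dx = \frac{(b - a)^2}{12} \qquad (1.36)$$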


Example  1.5 

As  an  example  of  a uniform  distribution,  let  us  consider  the  chances  of  catching  a 
city  bus  knowing  only  that  the  buses  pass  a given  corner  every  15  min.  On  the  aver- 
age, how  long  will  we  have  to  wait  for  the  bus?  How  likely  is  it  that  we  will  have  to 
wait  at  least  10  min.? 

The  random  variable  in  this  example  is  the  time  T until  the  next  bus.  Assuming 
no  knowledge  of  the  bus  schedule,  T is  uniformly  distributed  from  0 to  15  min.  Here 
we  are  saying  that  the  probabilities  of  all  times  until  the  next  bus  are  equal.  Then: 


$$f(t) = \frac{1}{15 - 0} = \frac{1}{15}$$

The average wait is:

$$E(T) = \int_0^{15} \frac{t\,dt}{15} = 7.5$$

Waiting at least 10 min. implies that T lies between 10 and 15, so that by
Eq. (1.28):

$$P(10 < T < 15) = \int_{10}^{15} \frac{dt}{15} = \frac{1}{3}$$

In  only  one  case  out  of  three  will  we  need  to  wait  10  min.  or  more.  The  probability 
that  we  will  have  to  wait  exactly  10  min.  is  zero,  since  no  probability  is  assigned  to 
specific  values  in  continuous  distributions.  Characteristics  of  several  continuous  dis- 
tributions are  given  in  Table  1.2. 
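The numbers in Example 1.5 follow directly from the closed-form expressions for the uniform distribution. A minimal sketch, with function names that are assumptions made for illustration:

```python
def uniform_mean(a, b):
    """Mean of a uniform distribution on (a, b), Eq. (1.35)."""
    return (a + b) / 2


def uniform_variance(a, b):
    """Variance of a uniform distribution on (a, b), Eq. (1.36)."""
    return (b - a) ** 2 / 12


def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on (a, b); the interval is clipped to (a, b)."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)


a, b = 0.0, 15.0                                # buses pass every 15 min
mean_wait = uniform_mean(a, b)                  # 7.5 min
var_wait = uniform_variance(a, b)               # 18.75 min^2
p_long_wait = uniform_prob(10.0, 15.0, a, b)    # one chance in three
print(mean_wait, var_wait, round(p_long_wait, 4))
```

Because the density is constant, every probability reduces to the length of the interval divided by b - a.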

Table 1.2 Continuous distributions

| Distribution | Density | Mean | Variance | Model | Example |
|---|---|---|---|---|---|
| Uniform | $f(x) = 1/(b - a)$; $a < x < b$ | $\frac{1}{2}(a + b)$ | $\frac{(b - a)^2}{12}$ | f(x) = constant | Waiting for a bus |
| Negative exponential | $f(x) = \lambda e^{-\lambda x}$; $x > 0$ | $\frac{1}{\lambda}$ | $\frac{1}{\lambda^2}$ | Describes distribution of time between the successive events in a Poisson distribution | Time between emission of particles in radioactive decay |
| Normal | $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right]$ | $\mu$ | $\sigma^2$ | Gaussian distribution | Many experimental situations |
| Standard normal | $f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right)$ | 0 | 1 | A special case of the normal distribution | Many experimental situations |
| Chi-square | $f(x) = \frac{x^{k/2 - 1} \exp(-x/2)}{2^{k/2}\,\Gamma(k/2)}$; $x > 0$ | $k$ | $2k$ | Distribution of a sum of squares of independent standard normal variables; k is referred to as "degrees of freedom" | Statistical tests on assumed normal distribution |


1.1.3 

Normal  Distributions 

The normal distribution was proposed by the German mathematician Gauss. This
distribution is applied when analyzing experimental data and when estimating random
errors, and it is known as Gauss' distribution. The normal distribution is the most
widely used of all continuous distributions, for the following reasons:

• many  random  variables  that  appear  during  an  experiment  have  normal  distri- 
butions; 

• large numbers of random variables have approximately normal distributions;

• if a random variable does not have a normal distribution, not even approximately,
it can often be transformed into a normal random variable by relatively
simple mathematical transformations;

• certain  complex  distributions  can  be  approximated  by  normal  distribution 
(binomial  distribution); 

• certain  random  variables  that  serve  for  verification  of  statistical  tests  have 
normal  distributions. 

Gauss  assumed  that  any  errors  in  experimental  observations  were  due  to  a large  num- 
ber of  independent  causes,  each  of  which  produced  a small  disturbance.  Under  this 
assumption  the  well-known  bell-shaped  curve  has  been  obtained.  Although  it  ade- 
quately describes  many  real  situations  involving  experimental  observations,  there  is  no 
reason  to  assume  that  actual  experimental  observations  necessarily  conform  to  the 
Gaussian  model.  For  example,  Maxwell  used  a related  model  in  deriving  a distribution 
function  for  molecular  velocities  in  a gas;  but  the  result  is  only  a very  rough  approxima- 
tion of  the  behavior  of  real  gases.  Error  in  experimental  measurement  due  to  com- 
bined effects  of  a large  number  of  small,  independent  disturbances  is  the  primary 
assumption  involved  in  the  model  on  which  the  normal  distribution  is  based.  This 
assumption leads to the exponential form of the normal distribution.

The  assumption  of  a normal  distribution  is  frequently  and  often  indiscriminately 
made  in  experimental  work,  because  it  is  a convenient  distribution  on  which  many 
statistical  procedures  are  based.  Many  experimental  situations,  subject  to  random 
error,  yield  data  that  can  be  adequately  described  by  the  normal  distribution,  but  this 
is  not  always  the  case. 

The terms μ and σ are initially defined simply as parameters in the normal distribution
function. The term μ determines the value on which the bell-shaped curve is
centered and σ determines the “spread” in the curve (Fig. 1.3).

A large variance gives a broad, flat curve, while a small variance yields a tall, narrow
curve with most probabilities concentrated on values near μ.

The mean or expected value of the normal distribution is obtained by applying
Eq. (1.30):


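Figure 1.3 How the variance affects the normal distribution curve.

$$E(X) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{+\infty} x \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] dx \qquad (1.37)$$

Integration of Eq. (1.37) is tedious, and is most easily accomplished by using a
table of definite integrals. To integrate, we define a new variable:

$$y = \frac{x - \mu}{\sigma} \;\Rightarrow\; x = \sigma y + \mu; \quad dx = \sigma\,dy \qquad (1.38)$$

Since μ and σ are constant, Eq. (1.37) becomes:

$$E(X) = \frac{1}{\sigma\sqrt{2\pi}} \left[\int_{-\infty}^{+\infty} \sigma^2 y \exp\left(-\frac{y^2}{2}\right) dy + \mu\sigma \int_{-\infty}^{+\infty} \exp\left(-\frac{y^2}{2}\right) dy\right] \qquad (1.39)$$

We find that the first integral is zero, since its integrand is an odd function of y.
For the second integral, we get from a table of definite integrals:

$$\int_{-\infty}^{+\infty} \exp\left(-\frac{y^2}{2}\right) dy = \sqrt{2\pi} \;\Rightarrow\; E(X) = \mu \qquad (1.40)$$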


A similar analysis shows that the variance of X is simply σ². The standard normal
distribution is obtained by defining a standardized variable z:

$$z = \frac{x - \mu}{\sigma} \;\Rightarrow\; x = \sigma z + \mu; \quad dx = \sigma\,dz \qquad (1.41)$$

So that the probability density function becomes:

$$f(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) \qquad (1.42)$$

In this case, μ_z = 0; σ_z = 1. The graph of the standard normal distribution is illustrated
in Fig. 1.4, and a brief tabulation, taken from Table B, is given in Table 1.3.

About 68% of the area under the curve lies between z = -1 and z = +1 (one standard
deviation on either side of the mean). Statistically, from Eq. (1.28):

$$P(-1 < z < 1) = \int_{-1}^{+1} f(z)\,dz = \frac{1}{\sqrt{2\pi}} \int_{-1}^{+1} \exp\left(-\frac{z^2}{2}\right) dz$$

The integral is evaluated from Table 1.3:

$$P(-1 < z < 1) = P(z < 1) - P(z < -1) = 0.8413 - 0.1587 = 0.6826$$
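Tabulated values of the standard normal distribution can also be reproduced from the error function. The sketch below assumes only the Python standard library; the name `phi` is an illustrative choice.

```python
from math import erf, sqrt


def phi(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Area within one, two and three standard deviations of the mean
for k in (1, 2, 3):
    print(k, round(phi(k) - phi(-k), 4))
```

For k = 1 this gives 0.6827; the small difference from the value obtained with the table comes from rounding the tabulated entries to four decimals.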
Figure 1.4 Standard normal distribution curve


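$$P(Z \le a) = \int_{-\infty}^{a} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^2}{2}\right) dz$$


Table 1.3 Abbreviated table of standard normal distribution

| z | 0.0 | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.9 |
|---|---|---|---|---|---|---|---|
| -3.0 | 0.0013 | | | | | | |
| -2.0 | 0.0228 | 0.0179 | 0.0139 | 0.0107 | 0.0082 | 0.0062 | 0.0019 |
| -1.0 | 0.1587 | 0.1357 | 0.1151 | 0.0968 | 0.0808 | 0.0668 | 0.0287 |
| -0.0 | 0.5000 | 0.4602 | 0.4207 | 0.3821 | 0.3446 | 0.3085 | 0.1841 |
| +0.0 | 0.5000 | 0.5398 | 0.5793 | 0.6179 | 0.6554 | 0.6915 | 0.8159 |
| +1.0 | 0.8413 | 0.8643 | 0.8849 | 0.9032 | 0.9192 | 0.9332 | 0.9713 |
| +2.0 | 0.9772 | 0.9821 | 0.9861 | 0.9893 | 0.9918 | 0.9938 | 0.9981 |
| +3.0 | 0.9987 | | | | | | |

The column heading gives the first decimal of z, counted away from zero: the entry in row -2.0, column 0.5 is P(Z ≤ -2.5), and the entry in row +1.0, column 0.5 is P(Z ≤ 1.5).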
Example 1.6

For example, suppose we have a normal population with μ = 6; σ² = 4, and we want
to know what percentage of the population has values greater than 9.

By Eq. (1.41), z = (9 - 6)/2 = 1.5. From Table 1.3, P(z < 1.5) = 0.9332, and P(z > 1.5) =
1 - P(z < 1.5) = 1 - 0.9332 = 0.0668. Hence about 6.68% of the population has values
greater than 9.
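The result of Example 1.6 can be cross-checked with the `NormalDist` class of the Python standard library (available from Python 3.8). This is a sketch for verification only, not the method used in the text, which relies on the table.

```python
from statistics import NormalDist

# Population of Example 1.6: mean 6, variance 4, hence standard deviation 2
population = NormalDist(mu=6.0, sigma=2.0)

z = (9.0 - population.mean) / population.stdev   # standardized value, Eq. (1.41)
tail = 1.0 - population.cdf(9.0)                 # fraction of values above 9

print(z, round(tail, 4))  # 1.5 0.0668
```

Working with the original variable or with the standardized variable z gives the same probability, which is the point of the standardization.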


Problem  1.2  4 


Using Table 1.3, i.e. Table B for the standard normal distribution, determine
the probabilities that correspond to the following Z intervals.

a) 0 < Z < 1.4;

b) -0.78 < Z < 0;

c) -0.24 < Z < 1.9;

d) 0.75 < Z < 1.96;

e) -∞ < Z < 0.44;

f) -∞ < Z < 1.2;

h) -1.96 < Z < 1.96;

i) -2.58 < Z < 2.58


Approximations  to  discrete  distribution 

It  has  already  been  mentioned  that  certain  distributions  can  be  approximated  to  a 
normal  one.  As  the  size  of  the  sample  increases,  the  binomial  distribution  asympto- 
tically approaches  a normal  distribution.  This  is  a useful  approximation  for  large 
samples. 


Example  1.7 

For example, suppose that we wish to know the probability of obtaining 40 or fewer
heads in 100 tosses of a coin. From Eq. (1.24), we sum the 41 values of P for k = 0 to 40:

$$P(X \le 40) = \sum_{k=0}^{40} \binom{100}{k} (0.5)^{100}$$

This expression would be very tedious to evaluate, so we use the normal approximation.
By Eq. (1.26) we get:

$$\mu = 100 \times 0.5 = 50; \quad \sigma^2 = 100 \times 0.5 \times (1 - 0.5) = 25$$

Then, by Eq. (1.41) it follows:

$$Z = \frac{40 - 50}{5} = -2.0; \quad P(z < -2) = 1 - P(z < 2) = 1 - 0.9772 = 0.0228$$

so that P(X ≤ 40) = 0.0228 or 2.28%. For small samples, the approximation is
improved by adding 0.5 to the X in Eq. (1.41):

$$Z = \frac{(X + 0.5) - \mu}{\sigma}$$
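With a computer the binomial sum of Example 1.7 is no longer tedious, so the quality of the normal approximation can be examined directly. The sketch below compares the exact sum with the approximation, with and without the 0.5 correction; the variable names are assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, x = 100, 0.5, 40

# Exact binomial probability P(X <= 40): sum of the 41 terms of Eq. (1.24)
exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(x + 1))

mu = n * p                       # Eq. (1.26)
sigma = sqrt(n * p * (1 - p))    # square root of the variance

std_normal = NormalDist()        # mean 0, standard deviation 1
approx = std_normal.cdf((x - mu) / sigma)                 # z = -2.0
approx_corrected = std_normal.cdf((x + 0.5 - mu) / sigma) # z = -1.9

print(f"exact                 {exact:.4f}")             # about 0.0284
print(f"normal approximation  {approx:.4f}")            # about 0.0228
print(f"with 0.5 correction   {approx_corrected:.4f}")  # about 0.0287
```

In this case the corrected value lies closer to the exact sum than the uncorrected one.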
Problem 1.3

Determine z1 if the probability that Z lies in the given interval is:


a) P(-∞ < Z < z1) = 0.92

b) P(-1.6 < Z < z1) = 0.03


Problem  1.4 

A sample  of  36  observations  was  drawn  from  a normally  distributed 
population  having  a mean  of  20  and  variance  of  9.  What  portion  of 
the  population  can  be  expected  to  have  values  greater  than  26? 


Problem 1.5 [4]

The  average  particulate  concentration  in  micrograms  per  cubic 
meter  for  a station  in  a petrochemical  complex  was  measured  every 
6 hours  for  30  days.  The  resulting  data  are  given  in  table. 


5 7 9 12 13 16 17 19 23 24
41 18 24 6 10 16 14 23 19 8
20 26 15 6 11 16 12 22 9 8
15 18 13 7 13 14 8 17 19 11
21 9 55 72 23 24 12 220 25 13
8 9 20 61 48 565 65 10 43 20
45 27 20 72 12 115 130 82 55 26
52 34 66 112 40 34 89 85 95 28
110 16 19 61 67 45 34 32 103 72
67 30 21 122 42 125 50 57 56 25
15 46 30 35 40 16 53 65 78 98
80 65 84 91 71 78 58 26 48

A new air pollution regulation requires that the total particulate
concentration be kept below 70 ± 5 mg/m3.

a)  What  is  the  probability  that  the  particulate  concentration  on  any  day  will  fall 
within  this  allowed  range? 

b)  What  is  the  probability  of  exceeding  the  upper  limit? 

c)  What  is  the  probability  of  operating  in  the  absolutely  safe  region  below  65 
mg/m3? 


Problem  1.6: 

A random variable has a normal distribution with the following
parameters: μ = 8; σ = 4. Find:


a)  P(5<X<10);  b)  P(10<X<15);  c)  P(X>15);  d)  P(X<5)? 


Problem  1.7 

Let us suppose that the body weights of 800 students have a normal
distribution with mean μ = 66 kg and standard deviation σ = 5 kg.
Find the number of students whose weight is:


a)  between  65  and  75  kg; 

b)  over  72  kg. 
Problem 1.8

A machine is set up for production of metal rods 24 cm long,
with a tolerance of ε = 0.05 cm. Based on long-term observation
it is established that σ = 0.03. On the assumption that the lengths X of the
metal rods have a normal distribution, calculate the percentage of
metal rods that will fall within the tolerance range. How large should
this tolerance be so that 95 per cent of the metal rods produced
are within the tolerance limits?


Problem  1.9 

In the lab mixer 20 batches of a composite rocket propellant were
mixed under identical conditions, all of them having the same composition.
Test strands were taken out of the obtained propellant and
their burning rates were measured at 70 bar of pressure. The average
burning rate is X̄ = 8.5 mm/s, and the calculated variance is
σ² = 0.30. What number of strands has the burning rate:


a)  between  8 and  9 mm/s; 

b)  over  9 mm/s; 

c)  below  8 mm/s. 


1.2 

Statistical  Inference 

After  gathering  a set  of  experimental  data,  we  usually  wish  to  use  it  to  draw  a con- 
clusion about  the  underlying  population.  For  example,  from  data  on  the  yield  of  a 
chemical  reactor,  we  may  want  to: 

• Decide  whether  the  average  yield  from  several  runs  at  constant  operating 
conditions  equals  that  required  by  economic  factors; 

• Determine  whether  one  set  of  operating  conditions  gives  a significantly  high- 
er yield  than  another; 

• Estimate  the  average  yield  to  be  expected  in  further  runs  at  specified  operat- 
ing conditions; 

• Find  a quantitative  equation  that  can  be  used  to  predict  the  yield  at  various 
operating  conditions. 

We  can  use  various  methods  of  statistical  inference  to  arrive  at  these  conclusions. 
Because  the  original  data  are  subject  to  experimental  error  and  may  not  exactly  fit 
our  presumed  model,  we  can  draw  a conclusion  only  within  specified  limits  of  cer- 
tainty. We  can  never  make  completely  unequivocal  inferences  about  a population  by 
using  statistical  procedures. 

Statistical inference may be divided into two broad categories:

• hypothesis  testing; 

• statistical  estimation. 
In the first case, we set up a hypothesis about the population and then either
accept  or  reject  it  by  a test  using  sample  data.  The  first  two  examples  in  the  first 
paragraph  above  involve  hypothesis  testing.  In  the  first  example,  the  hypothesis 
would  be  that  the  average  yield  equals  the  required  yield. 

Estimation  involves  the  calculation  of  numerical  values  for  the  various  population 
parameters  (mean,  variance,  and  so  on).  These  numerical  values  are  only  estimates 
of  the  actual  parameters,  but  statistical  procedures  permit  us  to  establish  the  accu- 
racy of  the  estimate. 

1.2.1 

Statistical  Hypotheses 

A statistical  hypothesis  is  simply  a statement  concerning  the  probability  distribution 
of  a random  variable.  Once  the  hypothesis  is  stated,  statistical  procedures  are  used 
to  test  it,  so  that  it  may  be  accepted  or  rejected.  Before  the  hypothesis  is  formulated, 
it  is  almost  always  necessary  to  choose  a model  that  we  assume  adequately  describes 
the  underlying  population.  The  choice  of  a model  requires  the  specification  of  the 
probability  distribution  of  the  population  parameters  of  interest  to  us.  When  a statis- 
tical hypothesis  is  set  up,  then  the  corresponding  statistical  procedure  is  used  to 
establish  whether  the  proposed  hypothesis  should  be  accepted  or  rejected.  Generally 
speaking,  we  are  not  able  to  answer  the  question  whether  a statistical  hypothesis  is 
right  or  wrong.  If  the  information  from  the  sample  taken  supports  the  hypothesis, 
we  do  not  reject  it.  However,  if  those  data  do  not  back  the  statistical  hypothesis  set 
up,  we  reject  it. 

In  principle,  two  hypotheses  are  set  up: 

• primary or null hypothesis H0;

• alternative hypothesis H1.

If we accept the null hypothesis H0 we automatically reject the alternative hypothesis
H1.

A large number of statistical hypotheses are of the kind that test specific values or
ranges of values of one or more distribution parameters. Such hypotheses are tested by using
the properties of sample data. Since simply drawing a sample from a population
does not guarantee that the sample is completely representative, we are
liable to make certain errors when accepting or rejecting a hypothesis.

Types  of  errors 

When  testing  statistical  hypotheses,  two  types  of  error  may  be  defined,  together  with 
their  probability  of  occurrence. 

• Type I error: Rejecting H0 when it is true. Let α equal the probability of rejecting
H0 when it is true. This term is also referred to as the “level of significance”
of the test.

• Type II error: Accepting H0 when it is false (that is, when H1 is true). Let β
equal the probability of accepting H0 when it is false.

In general, α and β are the risks of reaching a wrong decision: α is the risk of
rejecting a true hypothesis and β the risk of accepting a false one. Ideally
we would prefer a test that minimized both types of errors. Unfortunately, as α
decreases, β tends to increase, and vice versa. Apart from the terms mentioned we
should introduce the term power of a test. The power of a test is defined as the
probability of rejecting H0 when it is false. Symbolically: power of a test = 1 - β, the
probability of making a correct decision when H0 is false.
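The trade-off between α and β can be made concrete for a simple case. The sketch below assumes a two-sided test on a single observation from a normal population with known standard deviation; the numerical values (hypothesized mean 10, true mean 16, σ = 3) are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0, mu1, sigma):
    """Probability of accepting H0 (mean mu0) when the true mean is mu1,
    for a two-sided test on one normally distributed observation."""
    std = NormalDist()
    z_crit = std.inv_cdf(1.0 - alpha / 2.0)   # limit of the acceptance region
    shift = (mu1 - mu0) / sigma               # true mean in standardized units
    return std.cdf(z_crit - shift) - std.cdf(-z_crit - shift)


mu0, mu1, sigma = 10.0, 16.0, 3.0   # hypothetical values
betas = {}
for alpha in (0.10, 0.05, 0.01):
    betas[alpha] = type_ii_error(alpha, mu0, mu1, sigma)
    power = 1.0 - betas[alpha]
    print(f"alpha = {alpha:.2f}   beta = {betas[alpha]:.3f}   power = {power:.3f}")
```

Lowering α widens the acceptance region, so β rises and the power falls, as stated above. With a single observation the power is modest; larger samples are the usual way to reduce both risks together.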

The  test  statistic 

To  make  a test  of  the  hypothesis,  sample  data  are  used  to  calculate  a test  statistic. 
Depending  upon  the  value  of  the  test  statistic,  the  primary  hypothesis  H0  is 
accepted  or  rejected.  The  critical  region  is  defined  as  the  range  of  values  of  the  test 
statistic that requires rejection of H0. The test statistic is determined by the specific
probability  distribution  and  by  the  parameter  selected  for  testing. 

Procedure  for  testing  a hypothesis 

The  general  procedure  for  testing  a statistical  hypothesis  is: 

1.  Choose  a probability  model  and  a random  variable  associated  with  it.  This 
choice  may  be  based  on  previous  experience  or  intuition. 

2. Formulate H0 and H1. These must be carefully formulated to permit a meaningful
conclusion.

3. Specify the test statistic.

4. Choose a level of significance α for the test.

5.  Determine  the  distribution  of  the  test  statistic  and  the  critical  region  for  the 
test  statistic. 

6.  Calculate  the  value  of  the  test  statistic  from  a random  sample  of  data. 

7.  Accept  or  reject  H0  by  comparing  the  calculated  value  of  the  test  statistic  with 
the  critical  region. 

The  following  examples  illustrate  the  procedure  for  a statistical  test  [7].  In  the 
first,  we  consider  a very  simple  test  on  a single  observation.  The  second  applies  the 
seven-step  procedure  to  a test  on  the  mean  of  a binomial  population  using  a normal 
approximation.  Here,  and  in  the  third  example,  we  introduce  the  idea  of  one-sided 
and  two-sided  tests,  while  in  the  fourth  example  we  illustrate  the  calculation  of  Type 
II  error,  and  the  power  function  of  a test. 

Example  1.8 

A single  observation  is  taken  on  a population  that  is  believed  to  be  normally  distrib- 
uted with  a mean  of  10  and  a variance  of  9.  The  observation  is  X=16.  Can  we  con- 
clude that  the  observation  is  from  the  presumed  population? 

To  answer  this  question,  we  follow  the  seven-step  procedure: 

1.  The  probability  model  is  a normal  distribution:  ji=10;  o2=9.  The  random  vari- 
able is  the  value  of  X. 
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2.  The  primary  hypothesis  H0:X=16  , is  from  a population  that  is  normally  dis- 
tributed with  N (10;  9). 

The  alternate  hypothesis  Hj : X=16,  is  not  from  the  presumed  population. 

3.  Since  we  have  only  one  observation  for  X,  the  test  statistic  is  simply  its  stan- 
dardized value,  Z = (X  — \i)/ox.  The  standard  normal  tables  give  the  distri- 
bution of  this  statistic. 

4.  Choice  of  the  level  of  significance  a:  is  arbitrary.  We  will  use  a=0.01  and 
a=0.05  because  one  of  these  values  is  commonly  used. 

5. If $H_0$ is true, X is distributed normally with $\mu = 10$ and $\sigma_X^2 = 9$, so the test statistic Z is a standard normal variable. A value of X that is too far above or below the mean leads to rejection of $H_0$, so we select a critical region at each end of the normal distribution. As illustrated in Fig. 1.5, a fraction 0.025 of the total area under the curve is cut off at each end for $\alpha = 0.05$. From the tables (Table B), we determine that the limits corresponding to these areas are Z=-1.96 and Z=1.96, so that if our single observation falls between these values, we accept $H_0$. The corresponding values for $\alpha = 0.01$ are Z=±2.58.

6. Find the value of the test statistic:

$$Z = \frac{16 - 10}{\sqrt{9}} = 2.0$$

7. Since 2 > 1.96, we reject $H_0$ at $\alpha = 0.05$. But since 2 < 2.58, we accept $H_0$ at $\alpha = 0.01$.


Figure  1.5  Critical  region  for  two-sided  test 


If we are willing to risk rejecting $H_0$ when it is true five times out of 100, we can reject it here; but if we wish to reduce the risk of rejecting a true hypothesis to one chance out of 100, then we must accept $H_0$ for the example. Normally, we do not carry along two values for $\alpha$. A single desired level of significance is chosen initially, and the decision is based on it. This almost trivial example illustrates the test procedure and demonstrates the meaning of the level of significance of a test.
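
The decision rule of Example 1.8 can be sketched in a few lines of Python. This is only an illustration: the function name `two_sided_z_test` is an assumption of this sketch, and the critical values come from the standard library's `statistics.NormalDist` instead of Table B.

```python
from statistics import NormalDist

def two_sided_z_test(x, mu, sigma, alpha):
    """Two-sided test on a single observation x from N(mu, sigma^2).

    Returns the standardized value Z, the critical value z(1 - alpha/2),
    and whether H0 is rejected. (Hypothetical helper, not from the text.)
    """
    z = (x - mu) / sigma
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, z_crit, abs(z) > z_crit

# Example 1.8: X = 16, mu = 10, sigma^2 = 9 (so sigma = 3)
for alpha in (0.05, 0.01):
    z, z_crit, reject = two_sided_z_test(16, 10, 3, alpha)
    print(f"alpha={alpha}: Z={z:.2f}, critical=±{z_crit:.2f}, reject H0: {reject}")
```

With the values of the example, the sketch rejects $H_0$ at the 0.05 level and accepts it at the 0.01 level, in agreement with step 7.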

Example  1.9 

Experience  in  tire  manufacture  at  a given  plant  shows  that  an  average  of  4.8%  of  the 
tires  are  rejected  as  imperfect.  In  a recent  day  of  operation,  60  out  of  1000  tires  were 
rejected.  Is  there  any  reason  to  believe  that  the  manufacturing  process  is  function- 
ing improperly? 

This  problem  can  be  reduced  to  a test  on  the  sample  value.  We  may  use  the  data 
on  long-range  operation  to  estimate  the  population  parameter  (p=0.048),  since  we 
have  no  other  way  of  knowing  it.  We  again  follow  the  seven-step  method: 

1. This is a binomial distribution with two outcomes: acceptance or rejection. However, since the sample size is large, we may use the normal approximation. Since p=0.048, we know that the population mean is:

$$\mu = np = 1000 \times 0.048 = 48$$

and the population variance is:

$$\sigma^2 = np(1-p) = 1000 \times 0.048\,(1 - 0.048) = 45.7$$

2. As the problem is stated, the primary hypothesis is not clearly shown. We choose to compare the mean of the population that is presumed to underlie the day's production to the long-range population mean:

$H_0$: $\mu \le 48$; $H_1$: $\mu > 48$

The question then is whether the day's rejection rate of 60 per 1000 is sufficiently high to reject the hypothesis that the rejection rate is 48 per 1000 or less.

3. Since we are using a normal approximation, the test statistic is simply the standard normal variable:

$$Z = \frac{X - 0.5 - \mu}{\sigma}$$

The subtraction of 0.5 from the sample value improves the normal approximation. It is called the "continuity correction". The numerator of the equation is the deviation of the sample value from the population mean. The denominator is simply the standard deviation of the presumed population.

Thus, Z is the number of standard deviations away from the mean at which we find the sample value X. Here we have used the sample value X as an estimate of the population mean presumed to underlie the day's production.

4. Let $\alpha = 0.05$. That is, the risk of rejecting the hypothesis $\mu = 48$ when it is true is one chance in 20.

5. To determine the critical region, we must know the distribution of the test statistic. In this case, Z is distributed as the standard normal distribution. With $H_0$: $\mu \le 48$ and $\alpha = 0.05$, we determine that the critical region will include 5% of the area on the high end of the standard normal curve (Fig. 1.6). The Z-value that cuts off 5% of the curve is found to be 1.645, from a table of the normal distribution such as Table B. Therefore, if our calculated Z is greater than 1.645, we reject $H_0$.

6. For this case, $\sigma^2 = 45.7$; X=60; $\mu = 48$; n=1000, and:

$$Z = \frac{60 - 0.5 - 48}{\sqrt{45.7}} = 1.70$$

7. Since 1.70 > 1.645, we reject $H_0$ and conclude that a rejection rate of 60 out of 1000 tires is significantly higher than the 4.8% rejection rate. We might now proceed to seek the cause of this change by checking the manufacturing process.


Figure  1.6  Critical  region  for  one-sided  test 


Suppose that instead of finding 60 imperfect tires in 1000, we had found 6 in a sample of 100. Then our solution gives:

$$Z = \frac{6 - 0.5 - 4.8}{\sqrt{4.57}} = 0.327$$

Therefore, with $\alpha = 0.05$ as before, we would accept $H_0$, since 0.327 < 1.645. Rejection of 6 tires out of 100 is not significantly different from the population proportion of 0.048, but 60 out of 1000 is significantly different. This illustrates the effect of sample size on statistical tests. A smaller sample is more influenced by random fluctuations than is a larger one, so that the same proportionate difference in a larger sample is statistically more significant than in a smaller one.
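
The two calculations of Example 1.9 differ only in the sample size, which a small Python sketch makes easy to see. The function name `one_sided_binomial_z` is an assumption of this illustration, and the critical value is computed with `statistics.NormalDist` in place of Table B.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_binomial_z(x, n, p, alpha=0.05):
    """Upper-tail test of H0: mu <= n*p with the normal approximation.

    x is the observed number of rejects in a sample of n items.
    (Hypothetical helper written for this illustration.)
    """
    mu = n * p                          # population mean, np
    sigma = sqrt(n * p * (1 - p))       # population standard deviation
    z = (x - 0.5 - mu) / sigma          # 0.5 is the continuity correction
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return z, z_crit, z > z_crit

for x, n in ((60, 1000), (6, 100)):
    z, z_crit, reject = one_sided_binomial_z(x, n, 0.048)
    print(f"{x}/{n}: Z={z:.3f}, critical={z_crit:.3f}, reject H0: {reject}")
```

The same observed proportion of 6% gives Z of about 1.70 for n=1000 but only about 0.33 for n=100, which is the sample-size effect described above.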

This example illustrates a one-sided test; that is, the critical region is on one side of the probability distribution because of the way the hypotheses are stated.

Example 1.10

Suppose  we  had  chosen  to  test  whether  the  daily  rejection  rate  of  60  out  of  1000  was 
significantly  different  from  the  population  proportion  of  0.048,  rather  than  signifi- 
cantly higher  than  the  population  proportion  as  in  the  previous  example.  In  this 
case,  the  hypotheses  would  be: 

$H_0$: $\mu = 48$; $H_1$: $\mu < 48$ or $\mu > 48$

This is a two-sided test. The critical region is split into two parts, one rejecting values that are too low and the other rejecting values that are too high. With $\alpha = 0.05$, each critical region has an area of $\alpha/2 = 0.025$. As shown in Fig. 1.5 of Example 1.8, the corresponding Z-values are Z=±1.96. We will reject $H_0$ if the calculated Z is less than -1.96 or greater than 1.96. With a two-sided test, $H_0$ is accepted for both 60/1000 (Z=1.70) and 6/100 (Z=0.327). For this problem, we would probably be more concerned with a rejection rate that was too high, so that the one-sided test would be more appropriate than the two-sided one.


Example  1.11 

Determine  the  power  of  the  test  in  Example  1.10  for  the  alternate  hypothesis  that 
the  mean  is  really  50,  using  the  sample  size  1000. 

In this case, we choose a single value from the alternate hypothesis of Example 1.10. From $H_1$: $\mu > 48$, we select $H_1$: $\mu = 50$.

The power varies with the value selected for the alternate hypothesis. Our hypotheses are then:

$H_0$: $\mu = 48$; $H_1$: $\mu = 50$

The  two  distributions  based  on  these  hypotheses  are  plotted  in  Fig.  1.7,  where 
both  variances  are  presumed  to  be  45.7  (Example  1.9). 

With $\alpha = 0.05$, for the two-sided test, Z=1.96 and $X = 48 + 1.96\sqrt{45.7} = 61.3$.

This is the lower limit of the upper critical region, as shown under the $H_0$ curve in Fig. 1.7. Similarly, $X = 48 - 1.96\sqrt{45.7} = 34.7$ is the upper limit of the lower critical region. Therefore, if X lies between 34.7 and 61.3, $H_0$ is accepted regardless of whether it is true or not.

Here we have omitted the continuity correction. This is permissible for large sample sizes. In addition, the method used here is only approximate because it assumes the variance is constant regardless of $H_1$. If $H_1$: $\mu = 50$ were true and $H_0$ thereby false, the curve for $H_1$ in Fig. 1.7 would be the correct one; but the limits 34.7 and 61.3 still define the region of acceptance of $H_0$. Because $H_0$ would be accepted when $H_1$ is true if X falls between (34.7; 61.3), the area under the curve $H_1$ between these limits is $\beta$, which is the probability of a type II error. The area is labeled in Fig. 1.7. To determine this area, we use the original limits with the curve for $H_1$ to determine the standardized limits of the $\beta$ region.


Upper limit: $Z = \dfrac{61.3 - 50}{\sqrt{45.7}} = 1.67$

Lower limit: $Z = \dfrac{34.7 - 50}{\sqrt{45.7}} = -2.26$


Figure  1.7  Evaluation  of  type  II  error 

From the normal tables (Table B), we have:

$P(Z<1.67) = 0.9525$; $P(Z<-2.26) = 1 - P(Z<2.26) = 1 - 0.9881 = 0.0119$.

Therefore:

$\beta = P(-2.26<Z<1.67) = 0.9525 - 0.0119 = 0.9406$.

The power is therefore only:

$1 - \beta = 0.0594$.

Suppose now that we repeat the calculation of the power for other specific values of $H_1$ and plot them as shown in Fig. 1.8. Inspection of Fig. 1.8 shows that the power would vary from a value of $\alpha$ where $H_1$: $\mu = 48$ to a value of 1.0 where $H_1$: $\mu = \pm\infty$. Such calculations yield the power function curve, as shown in Fig. 1.8. As expected, the further $\mu_1$ is removed from $\mu_0 = 48$, the higher is the probability of rejecting the false hypothesis $H_0$. Inspection of Fig. 1.7 shows that $\beta$ decreases as $\alpha$ increases, so that we could obtain a higher power at the sacrifice of the level of significance. A higher power at the same $\alpha$ is possible if a larger sample size is used.
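
The power calculation of Example 1.11 can be repeated for any alternative mean with a short Python sketch. The function name `power_two_sided` and the list of alternative means in the loop are assumptions of this illustration; like the example, the sketch keeps the variance fixed at 45.7 and omits the continuity correction.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(mu1, mu0=48.0, var=45.7, alpha=0.05):
    """Power of the two-sided test of H0: mu = mu0 when the true mean is mu1.

    The acceptance region is fixed under H0; beta is the area of the
    H1 curve that falls inside it. (Hypothetical helper for illustration.)
    """
    sigma = sqrt(var)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower, upper = mu0 - z * sigma, mu0 + z * sigma   # acceptance limits for X
    alt = NormalDist(mu1, sigma)                      # distribution of X under H1
    beta = alt.cdf(upper) - alt.cdf(lower)            # probability of a type II error
    return 1 - beta

# Points of a power curve like the one in Fig. 1.8 (alternative means chosen arbitrarily)
for mu1 in (48, 50, 55, 60, 65, 70):
    print(mu1, round(power_two_sided(mu1), 3))
```

For $\mu_1 = 50$ the sketch gives a power of about 0.060; the small difference from 0.0594 comes from the rounding of the limits and table values in the hand calculation. At $\mu_1 = 48$ the power equals $\alpha$, as stated in the text.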

In  this  part,  we  have  considered  the  fundamentals  of  statistical  tests  and  have 
seen  that  no  test  is  free  from  possible  error.  We  can  reduce  the  probability  of  reject- 
ing a true  hypothesis  only  by  running  a greater  risk  of  accepting  a false  hypothesis. 
We  note  that  a larger  sample  size  reduces  the  probability  of  error. 

A particularly high level of confidence, at least 0.9999, is demanded in rocket technology and the spacecraft industry. To reach this level, $\beta$ must be almost vanishingly small, so that there is practically no chance of mounting a defective part in such craft.




Figure  1.8  Power  function  for  Example  1.10 


1.3 Statistical Estimation

Engineers are often faced with the problem of using a set of data to calculate quantities that they hope will describe the behavior of the process from which the data were taken. Because the measured process variable may be subject to random fluctuations as well as to random errors of measurement, the engineer's calculated estimate is subject to error, but how much? Here is where the method of statistical estimation can help.

Statistical estimation uses sample data to obtain the best possible estimate of population parameters. The p value of the binomial distribution, the mean parameter of the Poisson distribution, or the $\mu$ and $\sigma$ values of the normal distribution are called parameters. Accordingly, to stress it once again, the part of mathematical statistics that deals with estimating the parameters of a population's probability distribution from sample statistics is called estimation theory. In addition, estimation furnishes a quantitative measure of the probable error involved in the estimate. As a result, the engineer not only has made the best use of the data, but also has a numerical estimate of the accuracy of the results.

Two kinds of estimate can be made: point estimates and interval estimates.

A point estimate uses the sample data to calculate a single best value, which estimates a population parameter. The point estimate is one number, a point on the numeric axis, calculated from the sample and serving as an approximation of the unknown parameter of the population distribution from which the sample was taken. Such a point estimate alone gives no idea of the error involved in the estimation. If parameter estimates are expressed as ranges, they are called interval estimates. J. Neyman calls these intervals confidence intervals, because the ranges chosen as interval estimates have a known confidence level. An interval estimate gives a range of values that can be expected to include the correct value a certain specified percentage of the time. This provides a measure of the error involved in the estimate. The wider the range of the interval estimate, the poorer the point estimate.

1.3.1 Point Estimates

The  best  point  estimate  depends  upon  the  criteria  by  which  we  judge  the  estimate. 
Statistics  provides  many  possible  ways  to  estimate  a given  population  parameter, 
and  several  properties  of  estimates  have  been  defined  to  help  us  choose  which  is 
best  for  our  purposes. 

Example  1.12  [8] 

Suppose  that  we  have  made  11  runs  on  a pilot-plant  reactor  at  constant  conditions 
and  have  obtained  the  following  values  of  the  percentage  yield  of  desired  product: 
32,  55,  58,  59,  59,  60,  63,  63,  63,  63,  67. 

The data fluctuate because of uncontrolled variables and measurement error. Suppose we want the best single value of the yield, where "best" means the yield that we can expect in future runs at the same conditions. We could calculate the sample mean $\bar X = 58.4$, the median m=60 and the mode 63. But perhaps the 32% yield was a run involving some error of which we are unaware. We cannot arbitrarily drop the run without knowing the cause of the low value, but the mean places undue weight on it.

We might use the median (60) as the best estimate of future yields, since the median does not weight the lowest value unduly. On the other hand, we were able to obtain 63% yield 4 times out of 11. Perhaps this value (the mode) is the best estimate of future operation at carefully controlled conditions. Obviously, statistics cannot make the judgments required in this example. It can say only that the sample mean (58.4) is the best estimate of the population mean under certain conditions, provided the data come from a random sample of the population. If the 32% yield is as likely to occur as any other value, then the mean is the best estimate.
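
The three candidate point estimates of Example 1.12 can be checked with Python's standard `statistics` module. The variable names are assumptions of this sketch; the last line, which recomputes the mean without the 32% run, is added only to show how strongly that single value pulls the mean and is not a recommendation to drop it.

```python
from statistics import mean, median, mode

# Percentage yields from the 11 pilot-plant runs of Example 1.12
yields = [32, 55, 58, 59, 59, 60, 63, 63, 63, 63, 67]

print("mean  :", round(mean(yields), 1))   # sample mean
print("median:", median(yields))           # middle value of the ordered data
print("mode  :", mode(yields))             # most frequent value

# Illustration only: the mean of the remaining ten runs if the 32% run is left out
print("mean without the 32% run:", mean(yields[1:]))
```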

Several  properties  of  estimates  help  us  to  determine  which  estimate  is  best  for 
our  purposes.  We  will  consider  three  here: 

• consistency,

• bias, 

• efficiency. 

Consistency 

A consistent estimate tends to give the correct value of the population parameter as the sample size is increased. For example, as n approaches infinity, the sample mean $\bar X$ tends to the population mean $\mu$, so $\bar X$ is a consistent estimate of $\mu$.

Bias

An estimate is unbiased if on the average it predicts the correct value. Mathematically, an estimate is unbiased if its expected value is equal to the population parameter that it is estimating. For example, the sample mean $\bar X$ is an unbiased estimate of the population mean $\mu$ because:

$$E(\bar X) = \mu$$

On the other hand, the sample variance defined by Eq. (1.4) is biased:

$$S_X^2 = \frac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n}$$

because, as shown:

$$E(S_X^2) = \frac{n-1}{n}\,\sigma_X^2$$

This means that the condition

$$E(S_X^2) = \sigma_X^2$$

has not been fulfilled, or that this kind of variance estimate is not unbiased but biased. The unbiased estimate of the population variance is given by the sample variance in the following form:

$$S_X^2 = \frac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n-1}$$

Estimates are themselves random variables and have, apart from a mean, their own variance. When choosing an estimate, it is not sufficient to require that it be consistent and unbiased; it is easy to cite examples of different consistent and unbiased estimates of a population mean. The criterion for a better estimate is: an estimate is better the smaller its dispersion. Let us assume that we have two consistent and unbiased estimates $\theta_1$ and $\theta_2$ of a population parameter, and let us suppose that $\theta_1$ has a smaller dispersion than $\theta_2$. Fig. 1.9 presents the distributions of the given estimates.

It is clear from the figure that both estimates have the same mean, or:

$E(\theta_1) = E(\theta_2) =$ population parameter

The values of the random variable $\theta_1$ are more closely centered around the population parameter than those of $\theta_2$. This means that the average error made in repeated estimation of the population parameter by means of $\theta_1$ will be smaller than when we do the same with $\theta_2$. The $\theta_1$ estimate can be said to be more efficient.


Efficiency 

This is an important property of an estimate. An efficient estimate is the one that gives values that are in general closest to the correct value. A measure of the spread of values around a true value is the variance. The statistical estimate that has the smallest variance is said to be the efficient estimate, and all others are compared with it by taking the ratio of the variances. For example, for a normally distributed population, the mean has the minimum variance $\sigma_X^2/n$, while the variance of the median is $\pi\sigma_X^2/2n$. The efficiency of the median as an estimate of the location of the normal population is then:

$$E = \frac{\sigma_X^2/n}{\pi\sigma_X^2/2n} = \frac{2}{\pi} = 0.637 \text{ or } 63.7\%$$

Efficiency  is  usually  a more  important  property  than  unbiasedness.  A statistic 
may  prove  to  be  unbiased  because  large  deviations  on  either  side  of  the  correct  value 
cancel  each  other,  but  it  would  be  highly  inefficient. 
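
The efficiency of the median can also be estimated by simulation. The sketch below is only an illustration: the sample size of 101, the number of trials and the seed are assumed values, and because the formula for the variance of the median is a large-sample result, the simulated ratio is expected to be near, not exactly equal to, $2/\pi$.

```python
import random
from statistics import median, pvariance

random.seed(1)                # fixed seed for a repeatable run
n, trials = 101, 4000         # assumed sample size and number of repetitions

means, medians = [], []
for _ in range(trials):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(sum(sample) / n)
    medians.append(median(sample))

# Ratio of the two sampling variances: variance of the mean over variance of the median
efficiency = pvariance(means) / pvariance(medians)
print("simulated efficiency of the median:", round(efficiency, 3))
```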

1.3.2 Interval Estimates

As mentioned, apart from point estimates there are interval estimates of parameters. No matter how well the parameter estimate has been chosen, it is only logical to ask how far the estimate obtained from the sample deviates from the correct value. For example, if in numerical analysis one obtains that the solution of an equation is approximately 3.24 and that ±0.03 is the maximal possible deviation from the unknown correct solution of the equation, then we are absolutely sure that the range (3.24-0.03=3.21; 3.24+0.03=3.27) contains the unknown correct solution of the equation. In statistics such certainty is not available; an interval can be given only together with the probability that it contains the parameter. Therefore the problem of determining the interval estimate is formulated in the following way:

Let the observed property X of the elements of a population have a distribution determined by the density function f(X). Let us randomly draw from this population a sample of n observations $X_1, X_2, \ldots, X_n$. For a probability $(1-\alpha)$ chosen beforehand and close to 1.0, we determine two values such that the unknown population parameter lies between them with probability $1-\alpha$ and outside them with probability $\alpha$. The interval limits are determined by the given sample $X_1, X_2, \ldots, X_n$. We say that $(Z_{\alpha/2}; Z_{1-\alpha/2})$ is the confidence interval for the population parameter if its correct value is within the range with the probability $1-\alpha$ given beforehand.

The interval defined in this way is referred to as a confidence interval, and the ends of the interval are called confidence limits. The probability $(1-\alpha)$ is called the confidence coefficient of the confidence interval. We can write down:

$$P(Z_{\alpha/2} < \text{population parameter} < Z_{1-\alpha/2}) = 1-\alpha \qquad (1.44)$$

The interval $(Z_{\alpha/2}; Z_{1-\alpha/2})$ is a random variable that changes from sample to sample. Some of these intervals will contain the population parameter, others will not. However, over a large number of samples, the relative frequency of cases in which the interval contains the population parameter will be approximately $1-\alpha$, and the relative frequency of cases in which it does not will be approximately $\alpha$.

If we, for instance, choose $1-\alpha = 0.95$, we can expect about 95% of samples to give a confidence interval containing the population parameter. These intervals are called 95% confidence intervals.

By choosing $1-\alpha = 0.99$, we can expect the confidence interval to contain the population parameter in some 99 out of 100 cases. But, as will be shown later, the confidence interval corresponding to the coefficient $1-\alpha = 0.99$ is wider than the one for $1-\alpha = 0.95$. This widening of the confidence interval is the price of increasing the confidence coefficient. Which value of the confidence coefficient $1-\alpha$ to choose in an actual case depends on what risk of error is acceptable.
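
The relative-frequency meaning of the confidence coefficient can be illustrated with a simulation. In the Python sketch below, the population mean of 60, the standard deviation of 9, the sample size of 11, the number of trials and the seed are all assumed values chosen for the illustration; the interval used is the known-variance interval for a mean.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(2)                                   # fixed seed for a repeatable run
mu, sigma, n = 60.0, 9.0, 11                     # assumed population and sample size
alpha, trials = 0.05, 5000

z = NormalDist().inv_cdf(1 - alpha / 2)          # about 1.96
half_width = z * sigma / sqrt(n)

hits = 0
for _ in range(trials):
    xbar = sum(random.gauss(mu, sigma) for _ in range(n)) / n
    # Does this sample's interval contain the true mean?
    if xbar - half_width < mu < xbar + half_width:
        hits += 1

coverage = hits / trials
print("fraction of intervals containing mu:", round(coverage, 3))
```

The printed fraction should be close to 0.95: each individual interval either contains $\mu$ or does not, but about 95% of the intervals do.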


Confidence interval for the mean when the variance is known

Let us suppose that a sample $X_1, X_2, \ldots, X_n$ was drawn from a normal population whose mean $\mu$ is unknown and whose variance $\sigma_X^2$ is known. The confidence interval for $\mu$ should be determined.

Based on the central limit theorem [3], the average $\bar X$ has a normal distribution $N(\mu, \sigma_X^2/n)$, or:

$$E(\bar X) = \mu; \quad \sigma_{\bar X}^2 = \sigma_X^2/n \qquad (1.45)$$

For testing the hypothesis on the population mean we can use the following statistic:

$$Z = \frac{\bar X - \mu}{\sigma_X/\sqrt{n}} \qquad (1.46)$$


For a two-sided test, we noted that when Z was less than $z_{\alpha/2}$ or greater than $z_{1-\alpha/2}$ we rejected the primary hypothesis $H_0$. We accepted $H_0$ if Z lay between $z_{\alpha/2}$ and $z_{1-\alpha/2}$. The probability that Z lies between these limits is therefore simply $(1-\alpha)$. Stating this mathematically, we get:

$$P(z_{\alpha/2} < Z < z_{1-\alpha/2}) = 1-\alpha \qquad (1.47)$$

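Substituting for Z gives:

$$P\left(z_{\alpha/2} < \frac{\bar X - \mu}{\sigma_X/\sqrt{n}} < z_{1-\alpha/2}\right) = 1-\alpha \qquad (1.48)$$

Or, the confidence interval is:

$$z_{\alpha/2} < \frac{\bar X - \mu}{\sigma_X/\sqrt{n}} < z_{1-\alpha/2}$$

$$z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \bar X - \mu < z_{1-\alpha/2}\frac{\sigma_X}{\sqrt{n}} \qquad (1.49)$$

Then we reverse the signs and add $\bar X$ to each term to give:

$$\bar X - z_{1-\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \mu < \bar X - z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} \qquad (1.50)$$

Since $z_{\alpha/2} = -z_{1-\alpha/2}$ we may rewrite Eq. (1.50) in a more conventional form:

$$\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \mu < \bar X + z_{1-\alpha/2}\frac{\sigma_X}{\sqrt{n}} \qquad (1.51)$$

Substituting Eq. (1.51) in Eq. (1.48) then gives:

$$P\left(\bar X + z_{\alpha/2}\frac{\sigma_X}{\sqrt{n}} < \mu < \bar X + z_{1-\alpha/2}\frac{\sigma_X}{\sqrt{n}}\right) = 1-\alpha \qquad (1.52)$$

Equation (1.52) gives an interval estimate of $\mu$. The estimate is centered on $\bar X$ and extends $z_{1-\alpha/2}\,\sigma_X/\sqrt{n}$ on either side of it.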


Example  1.13 

From the reactor data given earlier, 32, 55, 58, 59, 59, 60, 63, 63, 63, 63, 67, determine an interval estimate of yield, presuming that the population variance is $\sigma_X^2 = 81$.

Earlier we found the sample mean to be $\bar X = 58.4$. For $\alpha = 0.05$, $z_{\alpha/2} = -1.96$ and $z_{1-\alpha/2} = +1.96$. Substituting the values into Eq. (1.52) then gives:

$$P\left(58.40 - 1.96\frac{9}{\sqrt{11}} < \mu < 58.40 + 1.96\frac{9}{\sqrt{11}}\right) = 0.95 \qquad (1.53)$$

$$P(53.09 < \mu < 63.71) = 0.95 \qquad (1.54)$$

The interval defined in this way is referred to as a confidence interval, and the ends of the interval are called confidence limits. The quantity $(1-\alpha)$ is the confidence coefficient. We must remember that $\bar X$ and hence the confidence limits are the random variables in this statement, whereas $\mu$ is a constant. Thus, with continued sampling, we could obtain other sets of confidence limits. For example, suppose we make another set of 11 runs and get yields of:

47, 53, 56, 58, 58, 61, 61, 62, 64, 64, 65.

Then

$\bar X = 59.0$,

and the confidence interval is:

$$P(53.69 < \mu < 64.31) = 0.95 \qquad (1.55)$$

Equation (1.54) states that 95 out of 100 of these calculated random intervals will contain $\mu$. Or, less precisely, we can be 95% confident that the interval calculated actually contains $\mu$. If we want to be more certain, we must use a smaller level of significance $\alpha$. For example, with $\alpha = 0.01$, the interval estimate of Eq. (1.54) becomes:

$$P(51.41 < \mu < 65.39) = 0.99 \qquad (1.56)$$

This broader confidence interval is more likely to contain $\mu$. If we want to be absolutely certain (P=1.0) that the interval contains $\mu$, we must write:

$$P(-\infty < \mu < \infty) = 1.0 \qquad (1.57)$$

which gives a rather useless interval. The interval estimated in Eq. (1.54) is sometimes stated another way:

$$\mu = 58.40 \pm 5.31 \quad (\alpha = 0.05) \qquad (1.58)$$

Hypothesis  testing  with  interval  estimates 

The confidence interval may be used to test a hypothesis about the population parameter on which the confidence interval is based. For example, suppose that by using the data of Example 1.13 we wished to test:

$H_0$: $\mu = 57$

$H_1$: $\mu < 57$ or $\mu > 57$

If the confidence interval includes the hypothesized value given in $H_0$, then we accept $H_0$. For the first set of data, Eq. (1.54) shows that we should accept $H_0$ because 57 lies in the interval:

$$P(53.09 < \mu < 63.71) = 0.95.$$

This  procedure  is  really  equivalent  to  our  earlier  method  of  hypothesis  testing,  as 
an  inspection  of  Eqs.  (1.47)-(1.52)  shows.  In  this  part  and  the  previous  one,  we  have 
outlined  the  principles  of  statistical  tests  and  estimates.  In  several  examples,  we 
have  made  tests  on  the  mean,  assuming  that  the  population  variance  is  known.  This 
is  rarely  the  case  in  experimental  work.  Usually  we  must  use  the  sample  variance, 
which  we  can  calculate  from  the  data.  The  resulting  test  statistic  is  not  distributed 
normally,  as  we  shall  see  in  the  next  part  of  this  chapter. 

Tests  and  estimates  on  the  statistical  mean 

The mean is perhaps the most important single parameter in many experimental situations because it pinpoints the basic location of a population.

In most tests and estimates on the mean, it is assumed that the observations in the sample have been drawn independently from a normal population. This assumption is not as restrictive as it first appears because of the central limit theorem.

This theorem, which is of far-reaching importance in statistics, states that:

• The sum of identically distributed independent random variables (with finite variance) is approximately normally distributed for large sample size, regardless of the probability distribution of the population of the random variable,

• When n tends to infinity, the distribution of the random variable $\bar X$ tends to a normal distribution.

The consequence is:

If the random variables $X_1, X_2, \ldots, X_n$ are independent and have the same probability distribution, where:

$$E(X_i) = \mu; \quad \operatorname{Var}(X_i) = \sigma_X^2$$

then the probability distribution of the random variable $\bar X = \sum X_i/n$ tends to a normal distribution with parameters $\mu$ and $\sigma_X/\sqrt{n}$. As a consequence of the central limit theorem, many test statistics may be assumed to be distributed normally provided that the number of observations n is large, even though we do not know the underlying distribution. The normal approximation to the binomial distribution that we have used earlier results from the application of the central limit theorem, because a binomial variable is the sum of independent Bernoulli variables. How large must n be? Strictly speaking, the normal distribution is approached asymptotically as n approaches infinity. Practically, the size of sample depends on the precision desired.
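
The statement about $\bar X$ can be checked numerically for a population that is clearly not normal. The Python sketch below draws samples from a uniform distribution on (0, 1), which has mean 0.5 and variance 1/12; the choice of this population, the sample size of 30, the number of trials and the seed are assumptions made for the illustration.

```python
import random
from statistics import fmean, pstdev

random.seed(3)                      # fixed seed for a repeatable run
n, trials = 30, 5000                # assumed sample size and number of samples

# Means of repeated samples from a uniform (non-normal) population
means = [fmean([random.random() for _ in range(n)]) for _ in range(trials)]

observed_sd = pstdev(means)
expected_sd = (1 / 12) ** 0.5 / n ** 0.5        # sigma_X / sqrt(n)

# If the means are close to normal, about 95% fall within 1.96 standard deviations
within = sum(abs(m - 0.5) < 1.96 * expected_sd for m in means) / trials

print("observed sd of sample means:", round(observed_sd, 4))
print("sigma_X / sqrt(n)          :", round(expected_sd, 4))
print("fraction within ±1.96 sd   :", round(within, 3))
```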

For  a binomial  distribution,  the  normal  approximation  can  be  used  with  good 
accuracy  for  sample  sizes  as  low  as  8,  providing  the  binomial  k is  arbitrarily 
increased  by  0.5  in  calculating  the  approximate  normal  statistic.  For  values  of  the 
parameter  p near  0 or  1,  a larger  sample  must  be  used  to  obtain  an  accurate  approx- 
imation. 

One conclusion derived from the central limit theorem is that for large samples the sample mean $\bar X$ is normally distributed about the population mean $\mu$ with variance $\sigma_{\bar X}^2 = \sigma_X^2/n$, even if the population is not normally distributed. This means that we can almost always presume that $\bar X$ is normally distributed when we are trying to estimate or make a test on $\mu$, provided we have a large sample.

We have also seen that $\bar X$ is an unbiased, efficient, consistent estimate of $\mu$ if the sample is from an underlying normal population. If the underlying population deviates substantially from normality, the mean may not be the efficient estimate, and some other measure of location such as the median may be preferable. We have previously illustrated a simple test on the mean with an underlying normal population of known variance. We shall review this case briefly, applying it to tests between two means, and then proceed to tests where the population variance is unknown.

Tests  and  estimates  with  variance  known 

As discussed earlier, the simplest test on the mean presumes an underlying normal population with known variance, and establishes the hypotheses:

$$H_0: \mu \le \mu_0; \quad H_1: \mu > \mu_0 \qquad (1.59)$$

where $\mu_0$ is some preselected numerical value. These hypotheses yield a one-sided test, with the critical region at the upper side of the normal distribution. We might also formulate other hypotheses:

$$H_0: \mu \ge \mu_0; \quad H_1: \mu < \mu_0 \qquad (1.60)$$

$$H_0: \mu = \mu_0; \quad H_1: \mu < \mu_0 \text{ or } \mu > \mu_0 \qquad (1.61)$$

The choice among the three sets depends upon what we wish to test. If we wish to show that $\mu$ is higher than $\mu_0$, we use the first set. If we wish to test whether $\mu$ is less than $\mu_0$, we use the second set. To show that $\mu$ is simply unequal to $\mu_0$, we use the third set with a two-sided critical region. All of these sets use the test statistic:

$$Z = \frac{\bar X - \mu}{\sigma_X/\sqrt{n}} \qquad (1.62)$$


where  Z is  distributed  normally,  with  mean  equal  to  zero  and  variance  of  one,  and 
where  ox  is  known.  As  shown  earlier,  the  interval  estimate  that  is  equivalent  to  this 
test  on  p is: 


P(^+Za/2^<fi<^1_a/2^)=l-a 

This interval estimate is really based on the two-sided test of the third set of
hypotheses previously given. Although it is possible to define "one-sided" confidence
intervals based on the other two sets of hypotheses (1.59) and (1.60), such one-sided
intervals are rarely used. By one-sided, we mean an interval estimate that extends
from plus or minus infinity to a single random confidence limit. The one-sided
confidence interval may be understood as a range with one limit fixed by the probability
level $\alpha$ and the other at $\pm\infty$.
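The test statistic of Eq. (1.62) and the equivalent two-sided interval estimate can be sketched in a few lines of Python. This is only an illustration using the standard library: the function name `z_test_known_sigma`, the five readings and the value $\sigma_X = 2.0$ are assumptions made for the example, not data from the text.

```python
from math import sqrt
from statistics import NormalDist, mean

def z_test_known_sigma(sample, mu0, sigma, alpha=0.05):
    """Two-sided Z test on the mean with known sigma, plus the matching interval."""
    n = len(sample)
    x_bar = mean(sample)
    se = sigma / sqrt(n)                           # standard error of the mean
    z = (x_bar - mu0) / se                         # Eq. (1.62) evaluated at mu = mu0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{1-alpha/2}
    interval = (x_bar - z_crit * se, x_bar + z_crit * se)
    reject = abs(z) > z_crit                       # two-sided critical region
    return z, interval, reject

# Hypothetical data: five readings, sigma assumed known and equal to 2.0
z, interval, reject = z_test_known_sigma([10.0, 12.0, 11.0, 13.0, 9.0],
                                         mu0=10.0, sigma=2.0)
```

Whenever the hypothesized value lies inside the interval, the two-sided test does not reject the null hypothesis, which is the equivalence described above.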


Comparison  of  two  means:  variances  known 

When  the  two  variances  are  equal,  we  may  test  whether  two  means  are  equal  by 
using: 


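$H_0: \mu_1 = \mu_2; \qquad H_1: \mu_1 < \mu_2 \ \text{or} \ \mu_1 > \mu_2$  (1.63)

with the test statistic:

$Z = \dfrac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sigma_X \sqrt{1/n_1 + 1/n_2}}$  (1.64)

When $H_0$ is true, $\mu_1 - \mu_2 = 0$, and Eq. (1.64) is simplified. The distribution of Z is
normal for underlying normal populations, or for large sample sizes. If the two
populations have different variances, the test statistic is:

$Z = \dfrac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\sigma_{X_1}^2/n_1 + \sigma_{X_2}^2/n_2}}$  (1.65)

where $\sigma_{X_1}^2$ and $\sigma_{X_2}^2$ are the known variances.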
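As a minimal sketch of the two-mean statistic with known variances, the fragment below evaluates Eq. (1.65); with equal variances it reduces to Eq. (1.64). The function name `two_sample_z` and all summary statistics are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(x1_bar, x2_bar, var1, var2, n1, n2, delta0=0.0):
    """Z statistic of Eq. (1.65); delta0 is the hypothesized mu1 - mu2."""
    return (x1_bar - x2_bar - delta0) / sqrt(var1 / n1 + var2 / n2)

# Hypothetical summary statistics (not taken from the text)
z = two_sample_z(x1_bar=52.0, x2_bar=50.0, var1=4.0, var2=9.0, n1=16, n2=25)

# Two-sided probability of a value at least this extreme under H0
p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
```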


Tests  and  estimates  with  variance  unknown 

Usually when we have collected some data and wish to use them for tests or estimations,
we have no idea of the numerical value of the population variance. As a result,
the tests requiring known variance cannot be used. Instead, we calculate the sample
variance and use it in place of the population variance. For example, Eq. (1.62)
becomes:

$T = \dfrac{\bar{X} - \mu}{S_X / \sqrt{n}}$  (1.66)


The new statistic T follows the distribution usually referred to as "Student's t" (Table C), after
W.S. Gosset, who first worked out its distribution. For a normal population:

$T = \dfrac{(\bar{X} - \mu)/(\sigma_X/\sqrt{n})}{\sqrt{\dfrac{(n-1)S_X^2/\sigma_X^2}{n-1}}}$  (1.67)

where the numerator is the standard normal variable Z and where $(n-1)S_X^2/\sigma_X^2$ has a
chi-square distribution. The quantity (n-1) is called the "degrees of freedom".


Table 1.4 Selected values for t-distribution

Degrees of
freedom (n-1)   t_0.90   t_0.95   t_0.975   t_0.99   t_0.995
 1              3.08     6.31     12.7      31.8     63.7
 2              1.89     2.92     4.30      6.96     9.92
 3              1.64     2.35     3.18      4.54     5.84
 4              1.53     2.13     2.78      3.75     4.60
 5              1.48     2.01     2.57      3.36     4.03
 6              1.44     1.94     2.45      3.14     3.71
 7              1.42     1.90     2.36      3.00     3.50
 8              1.40     1.86     2.31      2.90     3.36
 9              1.38     1.83     2.26      2.82     3.25
10              1.37     1.81     2.23      2.76     3.17
15              1.34     1.75     2.13      2.60     2.95
20              1.32     1.72     2.09      2.53     2.84
25              1.32     1.71     2.06      2.48     2.79
30              1.31     1.70     2.04      2.46     2.75
∞               1.28     1.64     1.96      2.33     2.58

Remark: For t values at lower values of $\alpha$: $t_\alpha = -t_{1-\alpha}$.
Hence, $t_{0.05} = -t_{0.95} = -2.35$ for (n-1)=3.

A few values of the t-distribution are given in Table 1.4. We note
that t values are considerably higher than corresponding standard normal values for
small sample size; but as n increases, the t-distribution asymptotically approaches
the standard normal distribution. Even at a sample size as small as 30, the deviation
from normality is small, so that it is possible to use the standard normal distribution
for sample sizes larger than 30 (n>30), while in most cases the t-distribution is
used for n<30. This is equivalent to assuming that $S_X^2$ is an exact estimate of $\sigma_X^2$ at large
sample sizes (n>30).

We can also evaluate tests and estimates on the mean with variance unknown. If
we calculate $S_X$ from data, the test statistic equivalent to Eq. (1.62) is:


$T = \dfrac{\bar{X} - \mu}{S_X / \sqrt{n}}$  (1.68)

and the confidence interval is:

$P\left(\bar{X} + t_{\alpha/2}\,S_X/\sqrt{n} < \mu < \bar{X} + t_{1-\alpha/2}\,S_X/\sqrt{n}\right) = 1 - \alpha$  (1.69)

where T is distributed as Student's t with (n-1) degrees of freedom. The seven-step
procedure may be used with this T test, as shown in the following example.


Example  1.14 

To test the hypothesis that $\mu = 63$, use reactor-yield data:

32;  55;  58;  59;  59;  60;  63;  63;  63;  63;  67.

We first calculate the sample variance as follows:

$S_X^2 = \dfrac{\sum (X - \bar{X})^2}{n-1} = \dfrac{n\sum X^2 - (\sum X)^2}{n(n-1)} = \dfrac{11 \times 38340 - 642^2}{11(11-1)} = 87.05; \qquad S_X = 9.33$

The seven-step procedure follows:

1.  The underlying distribution is assumed to be normal;

2.  $H_0: \mu = 63$; $H_1: \mu < 63$ or $\mu > 63$

3.  The test statistic is: $T = (\bar{X} - \mu)/(S_X/\sqrt{n})$

4.  Let $\alpha = 0.05$

5.  Test statistic T is distributed as Student's t with n-1=10 degrees of freedom.
From Table 1.4, we find $t_{1-0.05/2} = t_{0.975} = 2.23$; $t_{0.05/2} = t_{0.025} = -2.23$
for the two-sided test and $\alpha = 0.05$

6.  Since $\bar{X} = 58.4$, $T = (58.4 - 63.0)/(9.33/\sqrt{11}) = -1.64$

7.  Accept $H_0$ since -2.23 < -1.64 < 2.23


Example  1.15 

Make  an  interval  estimate  of  the  yield  from  the  reactor  in  Example  1.14  for  a 95% 
confidence  level. 

$P\left(58.4 - 2.23 \times 9.33/\sqrt{11} < \mu < 58.4 + 2.23 \times 9.33/\sqrt{11}\right) = 0.95$
$P(52.14 < \mu < 64.66) = 0.95$
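The arithmetic of Examples 1.14 and 1.15 can be retraced with a short Python sketch. The critical value 2.23 is read from Table 1.4 rather than computed, and the variable names are illustrative. Because the script carries the unrounded mean (58.36 rather than 58.4), its results differ slightly from the rounded figures quoted in the examples.

```python
from math import sqrt
from statistics import mean, stdev

yields = [32, 55, 58, 59, 59, 60, 63, 63, 63, 63, 67]
mu0 = 63.0
t_crit = 2.23          # t_{0.975} for 10 degrees of freedom, read from Table 1.4

n = len(yields)
x_bar = mean(yields)
s_x = stdev(yields)                 # sample standard deviation, divisor n - 1
se = s_x / sqrt(n)                  # estimated standard error of the mean

t_stat = (x_bar - mu0) / se         # Eq. (1.68)
accept_h0 = -t_crit < t_stat < t_crit

# 95% interval estimate of Eq. (1.69)
interval = (x_bar - t_crit * se, x_bar + t_crit * se)
```

The statistic comes out at about -1.65 and the interval at roughly 52.1 to 64.6, so the conclusion of the seven-step procedure is unchanged.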


Comparison  of  means,  variances  unknown 

If we wish to compare two means with population variances unknown, we have two
situations. The population variances may be presumed equal or unequal.

If the two variances are unknown but presumed equal, we calculate a pooled sample
variance:

$S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$  (1.70)

Then the test statistic is:

$T = \dfrac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{S_p \sqrt{1/n_1 + 1/n_2}}$  (1.71)

where T has $(n_1 + n_2 - 2)$ degrees of freedom. If the two variances are presumed
unequal, we use:


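$T = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$  (1.72)

where T is distributed with f degrees of freedom, and:

$f = \dfrac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 + 1} + \dfrac{(S_2^2/n_2)^2}{n_2 + 1}} - 2$  (1.73)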


The  interval  estimate  for  the  difference  in  any  two  means  corresponding  to  the 
test  in  Eq.  (1.71)  is: 


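$P\left(\bar{X}_1 - \bar{X}_2 + t_{\alpha/2}\,S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}} < \mu_1 - \mu_2 < \bar{X}_1 - \bar{X}_2 + t_{1-\alpha/2}\,S_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}\right) = 1 - \alpha$  (1.74)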


If we cannot assume that the two variances $S_1^2$ and $S_2^2$ are equal, i.e. we cannot calculate
the "pooled variance", then the (1-$\alpha$)100% confidence interval is based on the
following statistic:

$T = \dfrac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$  (1.75)

where the T statistic has an approximate t-distribution with f degrees of freedom:

$f = \dfrac{\left(S_1^2/n_1 + S_2^2/n_2\right)^2}{\dfrac{(S_1^2/n_1)^2}{n_1 - 1} + \dfrac{(S_2^2/n_2)^2}{n_2 - 1}}$  (1.76)

so that:

$P\left[-t_{f,1-\alpha/2} < T < t_{f,1-\alpha/2}\right] = 1 - \alpha$  (1.77)

$t_{f,1-\alpha/2}$ is obtained from Table C (Appendix).

By replacements we get the (1-$\alpha$)100% confidence interval:

$P\left(\bar{X}_1 - \bar{X}_2 - t_{f,1-\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}} < \mu_1 - \mu_2 < \bar{X}_1 - \bar{X}_2 + t_{f,1-\alpha/2}\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}\right) = 1 - \alpha$  (1.78)

Example 1.16

Using  95%  confidence  interval,  determine  whether  the  mean  nitric  acid  corrosion 
rate  of  metal  A is  different  from  that  of  metal  B.  The  data  on  10  runs  for  each  metal 
under  identical  conditions  are: 


A:  40;  42;  42;  43;  46;  47;  47;  48;  49;  50; 
B:  39;  41;  41;  44;  45;  45;  46;  47;  48;  48. 
For  metal  A and  B we  have: 

A:  $\bar{X}_1 = 45.4$;  $S_1^2 = 11.60$;

B:  $\bar{X}_2 = 44.4$;  $S_2^2 = 9.82$;

The variances are nearly the same, so we shall pool them:

$S_p^2 = \dfrac{9 \times 11.60 + 9 \times 9.82}{18} = 10.71; \qquad S_p = 3.27$

Substituting in Eq. (1.74), with $t_{0.975} = 2.10$; $t_{0.025} = -2.10$ for f=18 degrees of
freedom:

$P\left((45.4 - 44.4) + (-2.10) \times 3.27\sqrt{\dfrac{1}{10} + \dfrac{1}{10}} < \mu_1 - \mu_2 < (45.4 - 44.4) + 2.10 \times 3.27\sqrt{\dfrac{1}{10} + \dfrac{1}{10}}\right) = 0.95$

$P(-2.07 < \mu_1 - \mu_2 < 4.07) = 0.95$

Because the interval includes zero, we may accept the hypothesis $H_0$ that the
means for each metal are not different:

$H_0: \mu_1 = \mu_2$, or equivalently $H_0: \mu_1 - \mu_2 = 0$
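The pooled-variance interval of Example 1.16 can be checked with the sketch below, which also evaluates the approximate degrees of freedom of Eq. (1.76) for the same data as a comparison. The critical value 2.10 is taken from the example, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

metal_a = [40, 42, 42, 43, 46, 47, 47, 48, 49, 50]
metal_b = [39, 41, 41, 44, 45, 45, 46, 47, 48, 48]
t_crit = 2.10     # t_{0.975} for 18 degrees of freedom, as used in Example 1.16

n1, n2 = len(metal_a), len(metal_b)
m1, m2 = mean(metal_a), mean(metal_b)
v1, v2 = variance(metal_a), variance(metal_b)     # sample variances S1^2, S2^2

# Pooled variance, Eq. (1.70), and the interval of Eq. (1.74)
sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
half_width = t_crit * sqrt(sp2) * sqrt(1 / n1 + 1 / n2)
pooled_interval = (m1 - m2 - half_width, m1 - m2 + half_width)

# Approximate degrees of freedom for the unequal-variance case, Eq. (1.76)
a, b = v1 / n1, v2 / n2
f = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
```

The interval evaluates to roughly (-2.07, 4.07). The approximate degrees of freedom come out just under 18, which is consistent with treating the two variances as equal here.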

1.3.3 

Control  Charts 

The  concept  of  a confidence  interval  may  be  used  to  set  up  a statistical  control  chart 
on  the  mean.  Let  us  consider  the  reactor  from  Example  1.14.  Suppose  we  want  to 
use  the  results  of  the  11  runs  to  establish  a procedure  for  operation  of  the  reactor  in 
future  runs. 

Let us establish a criterion whereby we conclude that the reactor is not in control
if the mean of five measurements of yield is more than three standard deviations
away from the population mean (as determined by the earlier 11 runs). Then we can
establish upper and lower control limits on a control range $\bar{X} \pm 3\sigma$, where $\sigma$ is the
standard deviation of the mean of five measurements, outside of which
we initiate corrective action on the reactor operation.

Example 1.17

Construct a control chart for the reactor from Example 1.14. Use this chart to determine
whether the reactor is under statistical control for the following averages of
five measurements of yield taken at hourly intervals:

Time:              1   2   3   4   5   6   7   8   9   10

$\bar{X}$, Yield:  55  60  62  52  45  44  43  44  42  43

From the earlier data, we take the sample parameters as our estimate of the population
parameters:

$\mu = \bar{X} = 58.4$; $\sigma_X = S_X = 9.33$

From Eq. (1.69), the upper control limit is:

$58.4 + 3 \times 9.33/\sqrt{5} = 70.9$

and the lower control limit is:

$58.4 - 3 \times 9.33/\sqrt{5} = 45.9$
These  control  limits  are  plotted  on  the  control  chart,  along  with  the  hourly  yield 
data  as  shown  in  Fig.  1.10. 


Figure  1.10  Control  chart  for  reactor  operation 


The  reactor  was  within  the  control  limits  until  the  5th  hour,  at  which  time  some 
corrective  action  should  have  been  undertaken  to  restore  the  yield.  As  it  is,  the  yield 
seems  to  have  settled  at  about  43%,  which  is  outside  the  control  range. 
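The control-limit calculation of Example 1.17 can be reproduced with the following sketch. It uses the rounded estimates 58.4 and 9.33 from Example 1.14; the variable names are illustrative.

```python
from math import sqrt

x_bar, s_x = 58.4, 9.33          # estimates from the 11 runs of Example 1.14
subgroup_size = 5                # each plotted point is a mean of five yields
hourly_means = [55, 60, 62, 52, 45, 44, 43, 44, 42, 43]

half_width = 3 * s_x / sqrt(subgroup_size)
ucl = x_bar + half_width         # upper control limit
lcl = x_bar - half_width         # lower control limit

# Hours (1-based) at which the plotted mean falls outside the control range
out_of_control = [hour for hour, y in enumerate(hourly_means, start=1)
                  if not lcl <= y <= ucl]
```

With these inputs the limits come out at about 70.9 and 45.9, and every point from the fifth hour onward lies below the lower limit.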

The choice of three standard deviations for the control limits is common. It is
equivalent to $\alpha = 0.0027$. That is, only 27 out of 10,000 average yields are likely to fall
outside of the control range because of random fluctuations. Therefore we are quite
justified in assuming that yields less than 46% are due to something other than random
fluctuations.

1.3.4

Control of Type II Error $\beta$

Numerous experimental studies and conclusions come down to testing the
null hypothesis on the basis of the observed sample. The method presented so far,
i.e. the significance test, takes into account only the type I error
$\alpha$. The new approach, in order to fulfill the requirements of an experimenter
completely, chooses the number of sample observations so that differences large
enough to matter to the experimenter can almost always be discovered by a
significance test.

This kind of experiment is said to have type II error control, i.e. control of the error of
not discovering a real deviation from the null hypothesis. The probability of making
a type II error is denoted $\beta$.

In experiments in which mean values are compared, the number of observations
that should be made depends, as might be expected, on the quantities:
$\sigma$: the experimental error standard deviation;

$\delta$: the size of difference between the means it is important to detect;

$\alpha$: the risk of asserting a difference when none exists, that is, the level of probability
at which the significance test is made;

$\beta$: the risk of asserting no difference when a difference of $\delta$ exists.

Obviously, the number of observations in a sample is a function of $\alpha$, $\beta$ and
$D = \delta/\sigma$. D is the standardized difference, expressed in standard deviations. To determine
the number of observations in a sample for different comparison tests we use
Tables F and G.

When the standard deviation is known, the formula for determining the number of
observations in an experiment where two means are compared is easy to formulate.

The case will be considered in which it is desired to compare the mean $\bar{X}$ of a
sample of observations with a standard value $\mu_0$. Suppose $\bar{X}^*$ is the value that the
mean of the sample must exceed for the difference to be significant. In accordance
with Eq. (1.62) we have:

$\bar{X}^* = \mu_0 + z_\alpha\,\sigma/\sqrt{n}$  (1.79)

Here $z_\alpha$ and $z_\beta$ denote the standard normal values exceeded with probability $\alpha$ and
$\beta$, respectively. If $\bar{X}^*$ is so chosen, there is only a risk $\alpha$ that the sample mean $\bar{X}$ will exceed $\bar{X}^*$
when $\mu = \mu_0$. When $\mu$ is equal to $\mu_0 + \delta$ ($\mu - \mu_0 = \delta$), there exists the risk $\beta$ of not
accepting this assertion but accepting the incorrect hypothesis that $\mu = \mu_0$.

This is the risk that the sample mean $\bar{X}$ falls short of $\bar{X}^*$, and consequently

$\bar{X}^* = \mu_0 + \delta - z_\beta\,\sigma/\sqrt{n}$  (1.80)

Subtracting the second Eq. (1.80) from the first Eq. (1.79) it will be seen that both
equations are satisfied when:

$\delta = (z_\alpha + z_\beta)\,\sigma/\sqrt{n} \ \Rightarrow \ n = (z_\alpha + z_\beta)^2\,\sigma^2/\delta^2$  (1.81)

$n = \dfrac{(z_\alpha + z_\beta)^2}{D^2}$  (1.82)

where $D = \delta/\sigma$ is the difference it is important to detect, expressed as a multiple of
the standard deviation.

The experimenter should therefore perform the number of experiments indicated
by Eq. (1.82) and make a test of significance at the level $\alpha$.

If the result is significant, we should say that a real difference has occurred; if it is
not significant, we should say that no difference as large as $\delta$ has occurred. The
chances of obtaining a significant result when no difference has occurred and of
obtaining a nonsignificant result when a difference $\mu - \mu_0 = \delta$ has occurred will
thus be $\alpha$ and $\beta$, respectively.
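Equations (1.79) and (1.82) can be evaluated directly, as in the sketch below. For illustration it uses the values that appear in Example 1.18 ($\alpha = 0.01$, $\beta = 0.02$, $\delta = 0.04$, $\sigma = 0.03$, $\mu_0 = 1.50$). The function name `sample_size` is an assumption, and the normal points are computed with the standard library instead of being read from a table.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size(alpha, beta, delta, sigma):
    """Evaluate Eq. (1.82); returns the exact n, the rounded-up n, and z_alpha, z_beta."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # value exceeded with probability alpha
    z_beta = NormalDist().inv_cdf(1 - beta)     # value exceeded with probability beta
    d = delta / sigma                           # standardized difference D
    n_exact = (z_alpha + z_beta) ** 2 / d ** 2
    return n_exact, ceil(n_exact), z_alpha, z_beta

n_exact, n, z_alpha, z_beta = sample_size(alpha=0.01, beta=0.02,
                                          delta=0.04, sigma=0.03)

# Critical sample mean from Eq. (1.79), with mu0 = 1.50 and sigma = 0.03
x_star = 1.50 + z_alpha * 0.03 / sqrt(n)
```

Rounding n up rather than to the nearest integer keeps both risks at or below their target values.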

Example  1.18  [9] 

Primers of pressed tetryl are used for initiating charges of explosives. A factor affecting
their initiating power is their density. For a certain purpose it was desirable that
the density of the primers should exceed 1.4 g/cm3. A scheme was required so that a
decision whether to accept the batch as satisfactory or reject it because the average
density was too low could be based on the results obtained from testing a fairly small
randomly drawn sample of primers from the batch. The standard deviation was
known from past experience to be 0.03 and the mean density when the presses were
operating normally was 1.54.

A random sample of n primers will be drawn from the batch, the density of each
primer measured, and the sample mean $\bar{X}$ calculated. If $\bar{X}$ exceeds the value $\bar{X}^*$,
the batch will be accepted; if $\bar{X}$ falls short of this value it will be rejected.

To ensure that good batches are nearly always accepted and bad batches nearly
always rejected it was decided that the following requirements should be satisfied:

• If the mean density of the batch was as low as 1.50 there should be a 99%
chance of rejection (or a 1% chance of acceptance).

• If the mean density of the batch assumed the value 1.54 there should be a 98%
chance of acceptance (or a 2% chance of rejection).

Under the given conditions the standard value with which the mean is compared is
$\mu_0 = 1.50$, while the significance level, the type II error and the difference to be detected
are $\alpha = 0.01$, $\beta = 0.02$ and $\delta = 1.54 - 1.50 = 0.04$, respectively.

Thus: $D = \delta/\sigma = 0.04/0.03 = 1.33$. The standardized normal distribution table gives:
$z_\alpha = 2.326$ and $z_\beta = 2.054$. Using Eq. (1.82): $n = 10.8 \approx 11$, and Eq. (1.80):
$\bar{X}^* = 1.521$. This is illustrated for the present example in Fig. 1.11.

Figure 1.11 Distribution of means for samples of 11 primers


1.3.5 

Sequential  Tests 

The preceding part has indicated the basis on which the experimenter can decide
what number of observations is required to make comparative experiments conclusive,
on the assumption that this number must be decided before the experiment is
performed. When, as is common in chemical and physical research, the observations
are obtained one after another, it is generally possible to adopt an alternative
procedure in which, after each observation is made, a simple statistical test is
applied to determine whether the results obtained so far indicate a definite conclusion
from the experiment, or whether more observations are needed to make the
experiment decisive. The experiment thus terminates as soon as a definite conclusion
can be drawn, and the average number of observations required in experiments
carried out in this manner tends to be definitely less than when the number has to
be decided beforehand. Consequently this sequential method of performing comparative
experiments has definite advantages, particularly when the observations are
expensive or time consuming. The average number of observations is often only half
of the number required by nonsequential testing, and in cases of unexpectedly large
effects or differences, sequential testing can offer an answer after only one or two
observations. In sequential testing a decision is made after each new observation,
based on all previously obtained information, on whether:

• to accept the null hypothesis that no change of importance has occurred;

• to accept the alternative hypothesis that a real change has occurred;

• to continue taking observations.

Sequential tests are best explained graphically. As each new observation comes to
hand, the value of a function of all observations recorded up to that time is calculated
and plotted against the number of observations on a chart such as shown in
Fig. 1.12.

On the chart are two boundary lines, the positions of which depend upon the
risks $\alpha$ and $\beta$ of errors of type I and type II, the magnitude of the difference it is
important to detect, etc. The lines divide the chart into three zones: (1) in which the
null hypothesis is accepted; (2) in which the alternative hypothesis is accepted; and
(3) in which there is no decision.

The sequential test then consists in plotting the function of the observations f(X)
against the number of observations n and continuing to take observations so long as
the plotted points f(X) fall within zone (3). As soon as a point falls outside this zone, that is,
either in zone (1) (acceptance of $H_0$), or in zone (2) (acceptance of $H_1$), the
observations are discontinued and the indicated decision is taken.

In practice the chart is often replaced by a table of boundary values calculated
in advance for each value of n. The test is then made by successive comparisons of
the value f(X) with the appropriate limits.


Figure  1.12  Sequential  test  chart 

One-sided sequential testing: comparison of a mean with a standard value

Suppose it is required to test whether the population mean $\mu$ of a series of observations
is equal to some standard value $\mu_0$. As before, $\delta$ denotes the difference it is
important to detect, $\sigma$ the standard deviation, $\alpha$ the risk of asserting a significant
difference when none exists, and $\beta$ the risk of asserting no significant difference
when the mean value is really $\mu = \mu_0 + \delta$.

The function f(X) plotted for this test is simply the total T of the observations up
to the time considered, and the boundaries are parallel straight lines with slope S,
cutting the axis of T at $h_1$ and $h_0$. The values $h_0$, $h_1$ and S are given by the equations:

$h_0 = -b\,\sigma^2/\delta; \qquad h_1 = a\,\sigma^2/\delta; \qquad S = \mu_0 + \delta/2$  (1.83)

$a = \ln\dfrac{1-\beta}{\alpha}; \qquad b = \ln\dfrac{1-\alpha}{\beta}$  (1.84)

Table H gives values of a and b for commonly used values of $\alpha$ and $\beta$.

Example 1.19 [9]

Apply the sequential test to Example 1.18 so as to point out the difference between
sequential and nonsequential testing. It was assumed that $\sigma$ was constant and equal
to 0.03, and the procedure was planned so that there would only be a small risk
($\alpha = 0.01$) of accepting a bad batch, that is a batch with mean density as low as
$\mu_0 = 1.50$ g/cm3, and a small risk ($\beta = 0.02$) of rejecting a good batch, that is a batch
with mean density $\mu_1 = 1.54$ g/cm3. Thus $\delta = 0.04$, $\sigma = 0.03$, $\alpha = 0.01$, and $\beta = 0.02$. It was
found that for the nonsequential test n=11 observations would be required and that
the test should be made by calculating the mean of a sample of eleven, rejecting the
batch if the mean was less than $\bar{X}^* = 1.521$ g/cm3 and accepting it otherwise.

Had it been convenient to use a sequential scheme then as each primer was tested
the total T of the observations to date would be calculated and plotted on a chart
with suitable boundary lines. It will be found in practice that if $\mu_0$ is large compared
with $\delta$ these boundary lines will rise very steeply and appear to be very close together,
so that the chart will be difficult to use. Since the test is to detect a difference, it
will not be affected if a constant amount is subtracted from each observation. For
purposes of convenience, therefore, instead of considering the actual density we
consider the amount by which the density exceeds 1.40, that is to say, 1.40 is subtracted
from each observation. Then:

$\mu_0 = 0.10$; $\mu_1 = 0.14$; $h_0 = -0.0878$; $h_1 = 0.1032$; $S = 0.12$.

To construct the chart convenient values are chosen for the two scales, making
the vertical axis the axis of T and the horizontal axis the axis of n. Points are marked
off 0.1032 unit above zero and 0.0878 unit below zero on the axis of T, and through
these points lines are drawn that rise by 0.12 unit of T for each unit increase in n. A
chart is then obtained like that in Fig. 1.13.


Figure 1.13 Sequential test in control (vertical axis: cumulative total)


Suppose, for example, the first five observations were: 1.551; 1.527; 1.581; 1.517;
1.547. Subtracting 1.40, these would be treated as: 0.151; 0.127; 0.181; 0.117; 0.147,
and the cumulative totals: 0.151; 0.278; 0.459; 0.576; 0.723, would be plotted against
the values n=1, 2, 3, 4 and 5, respectively. As will be seen from Fig. 1.13, the last point
is in the zone of acceptance, and testing would therefore end and the batch would
be accepted at this stage.

Alternative  methods  of  presenting  the  results 

In most cases it is quicker to calculate limits $T_0$ and $T_1$ for each value of n from the
expressions:

$T_0 = h_0 + nS; \qquad T_1 = h_1 + nS$  (1.85)


Table 1.5 Limit values

n      1       2       3       4       5       6       7       8
T0     0.0322  0.1522  0.2722  0.3922  0.5122  0.6322  0.7522  0.8722
T1     0.2232  0.3432  0.4632  0.5832  0.7032  0.8232  0.9432  1.0632

The taking of observations is then continued so long as T lies between $T_0$ and $T_1$.
In the example given here, assuming as before that 1.40 is subtracted from each
observation, these limits are as shown in Table 1.5.

For the particular set of observations previously given, the successive cumulative
totals T were: T=0.151; 0.278; 0.459; 0.576 and 0.723. Testing would therefore be
discontinued at this stage with acceptance of the batch, since for the first time T falls
outside one of the boundaries denoted by $T_0$ and $T_1$ (for n=5, $T_5 = 0.723$ is outside the
range 0.5122-0.7032).
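The one-sided sequential procedure of Example 1.19 can be sketched as follows. The function names `sequential_limits` and `run_sequential_test` are illustrative assumptions. The intercepts and slope follow the relations $h_0 = -b\sigma^2/\delta$, $h_1 = a\sigma^2/\delta$ and $S = \mu_0 + \delta/2$, which reproduce the values -0.0878, 0.1032 and 0.12 quoted in the example.

```python
from math import log

def sequential_limits(mu0, delta, sigma, alpha, beta):
    """Intercepts h0, h1 and slope S of the two boundary lines."""
    a = log((1 - beta) / alpha)
    b = log((1 - alpha) / beta)
    h0 = -b * sigma ** 2 / delta
    h1 = a * sigma ** 2 / delta
    slope = mu0 + delta / 2
    return h0, h1, slope

def run_sequential_test(observations, h0, h1, slope):
    """Return (n, decision) at the first boundary crossing, else (n, 'continue')."""
    total = 0.0
    n = 0
    for n, x in enumerate(observations, start=1):
        total += x                                       # cumulative total T
        lower, upper = h0 + n * slope, h1 + n * slope    # T0 and T1 of Eq. (1.85)
        if total <= lower:
            return n, "accept H0"
        if total >= upper:
            return n, "accept H1"
    return n, "continue"

# Example 1.19: densities with 1.40 subtracted, so mu0 = 0.10 and delta = 0.04
h0, h1, slope = sequential_limits(mu0=0.10, delta=0.04, sigma=0.03,
                                  alpha=0.01, beta=0.02)
densities = [1.551, 1.527, 1.581, 1.517, 1.547]
n_stop, decision = run_sequential_test([d - 1.40 for d in densities],
                                       h0, h1, slope)
```

In this example "accept H1" means that the mean density is taken to be at the higher level of 1.54 g/cm3, so the batch is accepted. The run stops at the fifth observation, in agreement with Table 1.5.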

Two-sided  sequential  testing 

The technique so far considered is appropriate for testing whether a mean value is
significantly greater than some specified value when the standard deviation is accurately
known. A precisely similar procedure is used to test whether a mean value is
significantly less than the specified value. When, however, the alternative hypothesis
is that $\mu$ may depart from $\mu_0$ in either direction, the test procedure will be different.
The value $\delta$ will now be the deviation (positive or negative) from the specified value
that it is desired to detect. A suitable test 1) can be obtained by superimposing two
one-sided tests, say A and B, in each of which the error of type I is set at $\alpha/2$,
and $\delta$ is taken to be positive in one test and negative in the other.

The procedure consists of plotting the cumulative sum of the observations, taken
with $\mu_0$ as the origin, on a chart such as Fig. 1.14 on which both sets of boundary
lines are shown.

1) This is the two-sided test proposed by Barnard.

Figure 1.14 Sequential test chart: double-sided testing (vertical axis: cumulative total)

The lines divide the chart into a number of zones, which may be merged into
three shaded zones and the unshaded zone shown. In the upper shaded zone the
hypothesis that a real increase has occurred ($\mu > \mu_0$) will be accepted, in the center
shaded zone the hypothesis that no important change has occurred ($\mu = \mu_0$) will be
accepted, and in the lower shaded zone the hypothesis that a real decrease has occurred
($\mu < \mu_0$) will be accepted. The logic of this procedure may be seen by considering the
nature of the two individual tests A and B. A tests the hypothesis that no increase of
importance has occurred against the alternative that a real increase has
occurred ($\mu > \mu_0$), and B tests the hypothesis that no decrease of importance has occurred
against the alternative that a real decrease has occurred ($\mu < \mu_0$). The boundary lines
are given by the formulas


Test A: $T_0 = h_0 + nS; \qquad T_1 = h_1 + nS$
Test B: $T_0' = h_0' + nS'; \qquad T_1' = h_1' + nS'$  (1.86)

where:

$h_0 = -b\,\sigma^2/\delta = -h_0'$
$h_1 = a\,\sigma^2/\delta = -h_1'$
$S = \delta/2 = -S'$  (1.87)

$a = \ln\dfrac{1-\beta}{\alpha/2}$

$b = \ln\dfrac{1-\alpha/2}{\beta}$

$\delta$ is taken to be positive.

The calculations are simplified by the use of Table H, the values of a and b being
found directly by entering the table with the risk of the error of type I equal to
$\alpha/2$ and the risk of the error of type II equal to $\beta$.
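The two superimposed tests can be sketched with a small classifier for the cumulative sum. The function names `two_sided_limits` and `classify` are illustrative assumptions. The numerical inputs are those of Example 1.20 ($\sigma = 10$, $\delta = 10$, $\alpha = 0.05$, $\beta = 0.1$ and the nine observed differences), with the cumulative sum measured from $\mu_0 = 0$.

```python
from math import log

def two_sided_limits(delta, sigma, alpha, beta):
    """h0, h1 and S for test A per Eq. (1.87); test B uses the negatives."""
    a = log((1 - beta) / (alpha / 2))
    b = log((1 - alpha / 2) / beta)
    h0 = -b * sigma ** 2 / delta
    h1 = a * sigma ** 2 / delta
    return h0, h1, delta / 2

def classify(total, n, h0, h1, slope):
    """Zone of the cumulative sum (measured from mu0) after n observations."""
    if total >= h1 + n * slope:            # above T1 of test A
        return "real increase"
    if total <= -h1 - n * slope:           # below T1' of test B
        return "real decrease"
    if -h0 - n * slope < total < h0 + n * slope:   # between T0' and T0
        return "no important change"
    return "continue"

h0, h1, slope = two_sided_limits(delta=10, sigma=10, alpha=0.05, beta=0.10)

differences = [7, 5, 8, -11, 10, 8, -9, 6, -7]
total, decision, n_stop = 0, "continue", 0
for n_stop, d in enumerate(differences, start=1):
    total += d
    decision = classify(total, n_stop, h0, h1, slope)
    if decision != "continue":
        break
```

For small n the central zone is empty, because $T_0'$ lies above $T_0$ until the two lower boundary lines cross, so a verdict of "no important change" cannot be reached in the first few observations.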


Example  1.20  [9] 

In an investigation of factors affecting strength of synthetic fibers a modification
was made in the preparation of the material and a series of separate preparations of
fiber was made in pairs, one of the regular and one of the modified material; and a
number of properties of the resulting fibers were determined. The results for each
pair of observations were known before the next pair were carried out, so that a
sequential test could be employed. The most important property measured was the
breaking load, and it was known from past experience that the standard deviation of
the difference between repeat preparations with respect to this property was approximately
10 units. The experiment was designed so that the risk $\alpha$ of asserting nonexistent
changes was $\alpha = 0.05$ and so that a difference of $\delta = \pm 10$ would normally be
detected with 90% certainty.

Applying the sequential test to the differences of observed breaking load in repeat
pairs of observations, then:

$\mu_0 = 0$; $\mu_1 = \pm 10$ since $\delta = \pm 10$; $\alpha = 0.05$; $\beta = 0.1$; $\sigma = 10$; $h_0 = -22.8$; $h_1 = 35.8$; $S = 5$.

Based on the mentioned values the boundary lines are drawn:

Test A: $T_0 = -22.8 + 5n; \qquad T_1 = 35.8 + 5n$
Test B: $T_0' = 22.8 - 5n; \qquad T_1' = -35.8 - 5n$

A graphic illustration of a sequential test chart is in Fig. 1.15.

Figure 1.15 Sequential testing-synthetic fiber (intercepts $h_1 = 35.8$ and $h_0 = -22.8$ marked on the T axis)

In one experiment the following values were recorded for the differences in
breaking load between synthetic fibers prepared in two different ways: 7; 5; 8; -11;
10; 8; -9; 6; -7. The cumulative sums of the observations are 7; 12; 20; 9; 19; 27; 18; 24;
17, and the points are plotted in Fig. 1.15. The line crosses the limit $T_0$ at the ninth
observation (for n=9, $T_0 = -22.8 + 45 = 22.2$, which exceeds the cumulative sum of 17),
so the hypothesis that no important change in breaking load has
occurred is accepted. Exactly the same procedure can be done analytically.

1.4

Tests  and  Estimates  on  Statistical  Variance 


After  determining  the  location  of  a set  of  data  by  tests  or  estimates  on  the  mean,  we 
next  check  the  variability  of  the  data.  How  seriously  do  the  data  scatter  about  the 
mean?  If  the  scatter  is  large,  a given  observation  is  less  reliable  than  if  the  scatter  is 
small. 

A measure  of  the  scatter  or  variability  of  data  is  the  variance,  as  discussed  earlier. 
We  have  seen  that  a large  variance  produces  broad-interval  estimates  of  the  mean. 
Conversely,  a small  variability,  as  indicated  by  a small  value  of  variance,  produces 
narrow interval estimates of the mean. In the limiting case, when no random fluctuations
occur in the data, we obtain exactly identical measurements of the mean. In
this case, there is no scatter of data and the variance is zero, so that the interval estimate
reduces to an exact point estimate.

In practice, random fluctuations in process variables and random errors of measurement
are always present. If our measurements are sufficiently sensitive, we will
pick up these random fluctuations, and the variance of the measurements will not
be zero.

Obviously,  we  need  tests  and  estimates  on  the  variability  of  our  experimental 
data.  We  can  develop  procedures  that  parallel  the  tests  and  estimates  on  the  mean  as 
presented  in  the  previous  section.  We  might  test  to  determine  whether  the  sample 
was  drawn  from  a population  of  a given  variance;  or  we  might  establish  point  or 
interval estimates of the variance. We may wish to compare two variances to determine
whether they are equal. Before we proceed with these tests and estimates, we
must consider two new probability distributions. Statistical procedures for interval
estimates of a variance are based on chi-square and F-distributions. To be more precise,
the interval estimate of a variance $\sigma^2$ is based on the $\chi^2$-distribution, while the
estimate and testing of two variances is based on the F-distribution.


Chi-square ($\chi^2$) distribution

The chi-square distribution was discussed briefly in the earlier section on probability
distributions. Suppose we have k independent standard normal variables. We
then define $\chi^2$ as the sum of the squares of these k variables. It can be shown
that the probability density function of $\chi^2$ is:

$$f(\chi^2) = \frac{(\chi^2)^{k/2-1}\,\exp(-\chi^2/2)}{\Gamma(k/2)\,2^{k/2}} \qquad (1.88)$$

where $\Gamma(k/2) = (k/2-1)!$ when k is even. Because $\chi^2$ is a sum of squares of
standard normal variables, it has no negative values:

$$0 \le \chi^2 < \infty$$
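As a quick numerical check of Eq. (1.88), the sketch below evaluates the density with the gamma function from the Python standard library and integrates it with a plain trapezoidal rule. The function names `chi2_pdf` and `trapezoid`, the choice k = 4 and the integration limit of 80 are illustrative assumptions, not part of the original text.

```python
import math

def chi2_pdf(x, k):
    """Chi-square density with k degrees of freedom, Eq. (1.88)."""
    if x <= 0:
        return 0.0
    return x ** (k / 2 - 1) * math.exp(-x / 2) / (math.gamma(k / 2) * 2 ** (k / 2))

def trapezoid(f, a, b, n=20000):
    """Plain trapezoidal rule; accurate enough for a sanity check."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

k = 4  # hypothetical number of degrees of freedom
area = trapezoid(lambda x: chi2_pdf(x, k), 0.0, 80.0)      # total probability
mean = trapezoid(lambda x: x * chi2_pdf(x, k), 0.0, 80.0)  # first moment
print(round(area, 4), round(mean, 4))
```

The area under the curve should come out very close to 1 and the first moment very close to k, which is a convenient way to catch a misplaced exponent when the formula is typed by hand.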

The distribution depends on the number of independent variables included in the
summation. The parameter of the $\chi^2$ distribution is the number of degrees of
freedom k. When $\chi^2$ is computed from the deviations of n observations about their
sample mean, k = n - 1, which is one less than the number of observations. A brief table
of the $\chi^2$ distribution is given in Table 1.6, and a few representative curves are
shown in Fig. 1.16.


Table 1.6 Selected values of the χ²-distribution

Degrees of
freedom k   χ²_0.005  χ²_0.01  χ²_0.025  χ²_0.05  χ²_0.95  χ²_0.975  χ²_0.99  χ²_0.995
    1        0.000     0.000    0.001     0.004     3.84     5.02      6.63     7.88
    2        0.010     0.020    0.051     0.103     5.99     7.38      9.21     10.6
    3        0.072     0.115    0.216     0.352     7.81     9.35      11.3     12.8
    4        0.207     0.297    0.484     0.711     9.49     11.1      13.3     14.9
    5        0.412     0.554    0.831     1.15      11.1     12.8      15.1     16.7
    6        0.676     0.872    1.24      1.64      12.6     14.4      16.8     18.5
    7        0.989     1.24     1.69      2.17      14.1     16.0      18.5     20.3
    8        1.34      1.65     2.18      2.73      15.5     17.5      20.1     22.0
    9        1.73      2.09     2.70      3.33      16.9     19.0      21.7     23.6
   10        2.16      2.56     3.25      3.94      18.3     20.5      23.2     25.2
   15        4.60      5.23     6.26      7.26      25.0     27.5      30.6     32.8
   20        7.43      8.26     9.59      10.9      31.4     34.2      37.6     40.0
   25        10.5      11.5     13.1      14.6      37.7     40.6      44.3     46.9
   30        13.8      15.0     16.8      18.5      43.8     47.0      50.9     53.7

Figure 1.16 Chi-square distributions are unsymmetrical

Because $\chi^2$ takes values in the interval $(0; \infty)$, the distribution is unsymmetrical,
as Fig. 1.16 shows. The $\chi^2$-distribution is derived from Maxwell's distribution
of molecular velocities in gases [5].

For values k > 30, the $\chi^2$-distribution may be approximated from the standard
normal distribution:

$$\chi^2_{k;\alpha} = \tfrac{1}{2}\left[z_\alpha + \sqrt{2k-1}\right]^2 \qquad (1.89)$$

where $z_\alpha$ is the equivalent percentile of the standard normal variable.
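The normal approximation of Eq. (1.89) is easy to try out. The sketch below uses `statistics.NormalDist` from the Python standard library for the normal percentile; the helper name `chi2_percentile_approx` is an assumption made for illustration. It is evaluated at k = 30, the edge of the recommended range, so that the result can be compared with the last row of Table 1.6.

```python
from statistics import NormalDist

def chi2_percentile_approx(p, k):
    """Approximate p-th percentile of chi-square with k degrees of freedom, Eq. (1.89)."""
    z = NormalDist().inv_cdf(p)          # percentile of the standard normal variable
    return 0.5 * (z + (2 * k - 1) ** 0.5) ** 2

approx = chi2_percentile_approx(0.95, 30)
tabulated = 43.8                          # chi-square 0.95 percentile for k = 30, Table 1.6
print(round(approx, 2), tabulated)
```

Even at k = 30 the approximation differs from the tabulated value by less than one per cent, and the agreement improves as k grows.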

The mean of the $\chi^2$ distribution is k, and the variance is 2k. Because $\chi^2$ is the
sum of independent, identically distributed variables, its distribution is asymptotically
normal, as shown by the central limit theorem. This can be seen in the accompanying
figure for large values of k. For large values of k, we can write:

$$Z = \frac{\chi^2 - k}{\sqrt{2k}} \qquad (1.90)$$

Equation (1.90) gives values that are approximately distributed as the standard
normal variable. The $\chi^2$ distribution will be used later for tests on the variance,
because the following statistic has the $\chi^2$ distribution with k degrees of freedom, as
shown by Brownlee [10]:

$$\chi^2_k = \frac{k S_X^2}{\sigma_X^2} \qquad (1.91)$$
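A small simulation can make Eq. (1.91) tangible. The sketch below draws repeated normal samples of size n = k + 1, forms the statistic $kS_X^2/\sigma_X^2$ for each, and compares its mean and variance with the values k and 2k quoted above. The population mean of 10, the standard deviation of 2, the seed and the number of repetitions are arbitrary illustrative choices.

```python
import random

def sample_variance(values):
    """Unbiased sample variance with the (n - 1) divisor."""
    n = len(values)
    m = sum(values) / n
    return sum((v - m) ** 2 for v in values) / (n - 1)

random.seed(1)        # fixed seed so the run is repeatable
k = 4                 # degrees of freedom
sigma = 2.0           # hypothetical population standard deviation
n = k + 1             # sample size

stats = []
for _ in range(20000):
    sample = [random.gauss(10.0, sigma) for _ in range(n)]
    stats.append(k * sample_variance(sample) / sigma ** 2)   # Eq. (1.91)

mc_mean = sum(stats) / len(stats)
mc_var = sample_variance(stats)
print(round(mc_mean, 2), round(mc_var, 2))   # should be near k and 2k
```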


The  F-distribution 

The F-distribution is used to compare the variances of two populations. Suppose we
calculate the sample variances $S_1^2$ and $S_2^2$ for samples of size $n_1$ and $n_2$ drawn
from the two populations. Then F is defined as:

$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \qquad (1.92)$$

where F is distributed as the F-distribution with $n_1-1$ and $n_2-1$ degrees of freedom. If
we let $k_1 = n_1-1$ and $k_2 = n_2-1$ and use the relation in Eq. (1.91), we may rewrite
Eq. (1.92):

$$F = \frac{\chi^2_{k_1}/k_1}{\chi^2_{k_2}/k_2} \qquad (1.93)$$

If $\sigma_1^2 = \sigma_2^2$, which is frequently the condition being tested, Eq. (1.92) becomes:

$$F = \frac{S_1^2}{S_2^2} \qquad (1.94)$$

We see that F is the ratio of two chi-square variables, each divided by its
degrees of freedom. The F-distribution is usually written as $F(k_1, k_2)$, denoting the
degrees of freedom. It can be easily shown that:

$$F_\alpha(k_1; k_2) = \frac{1}{F_{1-\alpha}(k_2; k_1)} \qquad (1.95)$$

Table 1.7 Selected values of the F-distribution for F_0.95

       Degrees of freedom in larger variance, k1
 k2     1      2      3      4      5      6      7      8      9      10     20     30     ∞
  1    161    200    216    225    230    234    237    239    241    242    248    250    254
  2   18.51  19.0   19.16  19.25  19.30  19.33  19.36  19.37  19.38  19.39  19.44  19.46  19.50
  3   10.13   9.55   9.28   9.12   9.01   8.94   8.88   8.84   8.81   8.78   8.66   8.62   8.53
  4    7.71   6.94   6.59   6.39   6.26   6.16   6.09   6.04   6.00   5.96   5.80   5.74   5.63
  5    6.61   5.79   5.41   5.19   5.05   4.95   4.88   4.82   4.78   4.74   4.56   4.50   4.36
  6    5.99   5.14   4.76   4.53   4.39   4.28   4.21   4.15   4.10   4.06   3.87   3.81   3.67
  7    5.59   4.74   4.35   4.12   3.97   3.87   3.79   3.73   3.68   3.63   3.44   3.38   3.23
  8    5.32   4.46   4.07   3.84   3.69   3.58   3.50   3.44   3.39   3.34   3.15   3.08   2.93
  9    5.12   4.26   3.86   3.63   3.48   3.37   3.29   3.23   3.18   3.13   2.93   2.86   2.71
 10    4.96   4.10   3.71   3.48   3.33   3.22   3.14   3.07   3.02   2.97   2.77   2.70   2.54
 20    4.35   3.49   3.10   2.87   2.71   2.60   2.52   2.45   2.40   2.35   2.12   2.04   1.84
 30    4.17   3.32   2.92   2.69   2.53   2.42   2.34   2.27   2.21   2.16   1.93   1.84   1.62
  ∞    3.84   2.99   2.60   2.37   2.21   2.09   2.01   1.94   1.88   1.83   1.57   1.46   1.00

Using Table 1.7 and Eq. (1.95), we obtain, for example:

$$F_{0.05}(6; 10) = \frac{1}{F_{0.95}(10; 6)} = \frac{1}{4.06} = 0.246$$


The F-distribution is very widely used in statistical procedures. It is the
distribution used in the analysis of variance, which will be considered later. In this
section, we use the F-distribution in tests of equality of the variances of two populations.


Interrelationships  of  several  distributions 

In this chapter of statistics for engineers we have so far introduced four important
probability distributions used in statistical tests and estimates. These are:

• Standard normal distribution, $Z$;

• Student's t-distribution, $t_k$;

• Chi-square distribution, $\chi^2_k$;

• F-distribution, $F(k_1, k_2)$.

Here we summarize several relationships among these four distributions:

1. As the sample size approaches infinity, t approaches the standard normal
variable for the same α:

$$t_{\infty;\alpha} = Z_\alpha \qquad (1.96)$$

This is evident by an inspection of Tables B and C. For example:

$$t_{\infty;0.975} = Z_{0.975} = 1.96$$

2. As $k_2$ approaches infinity, the ratio $\chi^2_{k_2}/k_2$ approaches 1.0. As a result:

$$F_\alpha(k_1, \infty) = \frac{\chi^2_{k_1;\alpha}}{k_1} \qquad (1.97)$$

For example, an inspection of Tables 1.6 and 1.7 shows that:

$$F_{0.95}(5, \infty) = \chi^2_{5;0.95}/5 = 11.07/5 = 2.21$$

(Table 1.6 lists the rounded value 11.1.)

3. The chi-square distribution is related to the normal distribution by:

$$\chi^2_{1;1-\alpha} = Z^2_{1-\alpha/2} \qquad (1.98)$$

For example, with α = 0.05:

$$\chi^2_{1;0.95} = Z^2_{0.975} \Rightarrow 3.84 = 1.96^2 = 3.84$$

4. The F-distribution is related to the t-distribution by:

$$F_{1-\alpha}(1, k_2) = t^2_{k_2;1-\alpha/2} \qquad (1.99)$$

For example, with α = 0.05:

$$F_{0.95}(1; 10) = t^2_{10;0.975} \Rightarrow 4.96 = 2.228^2 = 4.96$$
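Two of these relationships can be checked with nothing more than the normal percentile function in the Python standard library. The sketch below is only a numerical illustration: it computes the 0.975 percentile of the standard normal variable and its square, and compares them with the limiting t value of 1.96 and with the tabulated 0.95 chi-square percentile for one degree of freedom. The variable names are illustrative.

```python
from statistics import NormalDist

z_975 = NormalDist().inv_cdf(0.975)   # upper 2.5% point of the standard normal
chi2_1_95 = z_975 ** 2                # square of that percentile

print(round(z_975, 3))       # compare with the limiting t value, 1.96
print(round(chi2_1_95, 2))   # compare with Table 1.6, k = 1, 0.95 column (3.84)
```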

Tests and estimates on a single variance

As shown earlier, the sample variance $S_X^2$ is an unbiased estimate of the population
variance $\sigma_X^2$. If the sample is from a normal population, $S_X^2$ is also the efficient
estimate, so that $S_X^2$ is usually the best point estimate of $\sigma_X^2$. The mean deviation or
the range may also be used to estimate the population standard deviation $\sigma_X$, but these
estimates are biased, less efficient and inconsistent.

If we wish to test whether a sample is drawn from a population of a specific
known variance, we have a two-sided test:

$$H_0: \sigma_X^2 = \sigma_0^2$$

$$H_1: \sigma_X^2 > \sigma_0^2 \ \text{or} \ \sigma_X^2 < \sigma_0^2$$

Assuming $H_0$ is correct, the test statistic is:

$$\chi^2 = \frac{k S_X^2}{\sigma_0^2} \qquad (1.100)$$

The critical region is split between the high and low ends of the distribution:

$$\frac{k S_X^2}{\sigma_0^2} < \chi^2_{k;\alpha/2} \ \text{or} \ \frac{k S_X^2}{\sigma_0^2} > \chi^2_{k;1-\alpha/2} \qquad (1.101)$$

If we wish to test whether the variance of a product exceeds a given value, we
have a one-sided hypothesis:

$$H_0: \sigma_X^2 \le \sigma_0^2 \qquad H_1: \sigma_X^2 > \sigma_0^2$$

and a one-sided critical region:

$$\frac{k S_X^2}{\sigma_0^2} > \chi^2_{k;1-\alpha} \qquad (1.102)$$

The confidence interval that is equivalent to the two-sided test is obtained from
the critical regions:

$$P\left[\frac{k S_X^2}{\chi^2_{k;1-\alpha/2}} \le \sigma_X^2 \le \frac{k S_X^2}{\chi^2_{k;\alpha/2}}\right] = 1 - \alpha \qquad (1.103)$$
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Example  1.21 

Determine point and interval estimates for the population variance of the reactor
yield data of earlier Example 1.12:

Yield %: 32; 55; 58; 59; 59; 60; 63; 63; 63; 63; 67.

The best point estimate is the sample variance, which was already calculated as
$S_X^2 = 87.05$, so that $S_X = 9.33$. Using Eq. (1.103) we get the interval estimate of the
variance for α = 0.05:

$k = n - 1 = 10$; $\chi^2_{0.975} = 20.5$; $\chi^2_{0.025} = 3.25$

$P(10 \times 87.05/20.5 \le \sigma_X^2 \le 10 \times 87.05/3.25) = 0.95$

$P(42.46 \le \sigma_X^2 \le 267.85) = 0.95$

The extremely wide interval is the consequence of the large sample variance.
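The arithmetic of Example 1.21 can be reproduced in a few lines. The sketch below takes the yield data from the example and the two chi-square percentiles for k = 10 from Table 1.6, then applies Eq. (1.103). Variable names are illustrative, and the percentiles are typed in from the table rather than computed.

```python
from statistics import variance

yields = [32, 55, 58, 59, 59, 60, 63, 63, 63, 63, 67]   # yield %, Example 1.21
k = len(yields) - 1                                     # degrees of freedom
s2 = variance(yields)                                   # sample variance, divisor n - 1

chi2_975 = 20.5   # Table 1.6, k = 10, 0.975 column
chi2_025 = 3.25   # Table 1.6, k = 10, 0.025 column

lower = k * s2 / chi2_975   # lower limit of Eq. (1.103)
upper = k * s2 / chi2_025   # upper limit of Eq. (1.103)
print(round(s2, 2), round(s2 ** 0.5, 2), round(lower, 2), round(upper, 2))
```

Because the limits are not symmetric about the point estimate, it is worth computing both explicitly instead of quoting the result as a value plus or minus a margin.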

Comparison  of  two  variances 

The F-statistic may be used to test the equality of two population variances. The
hypotheses are:

$$H_0: \sigma_1^2 = \sigma_2^2 \qquad H_1: \sigma_1^2 > \sigma_2^2 \ \text{or} \ \sigma_1^2 < \sigma_2^2 \qquad (1.104)$$

The test statistic is $F = S_1^2/S_2^2$, which has the F-distribution with $k_1$ and $k_2$
degrees of freedom.

Example  1.22 

Determine whether the assumption of equal variances in Example 1.16 was justified.
In the earlier problem, we found:

$S_1^2 = 11.60$; $n_1 = 10$; $S_2^2 = 9.82$; $n_2 = 10$.

The seven-step procedure is used:

1. It is necessary to assume that both populations are normally distributed. The
random variables are $X_1$ and $X_2$, the corrosion rates for metals A and B.

2. $H_0: \sigma_1^2 = \sigma_2^2$; $H_1: \sigma_1^2 > \sigma_2^2$ or $\sigma_1^2 < \sigma_2^2$

3. The test statistic is: $F = S_1^2/S_2^2$

4. Let α = 0.10

5. F is distributed as F(9; 9), and the critical region is:
$F < F_{0.05}(9; 9)$ or $F > F_{0.95}(9; 9)$.
From Table 1.7 or Table E, the value is: $F_{0.95}(9; 9) = 3.18$;
$F_{0.05}(9; 9) = 1/F_{0.95}(9; 9) = 1/3.18 = 0.314$

6. For this test: F = 11.60/9.82 = 1.18

7. Since 0.314 < 1.18 < 3.18, we do not reject $H_0$. There is no significant
difference in the variances.
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The  interval  estimate  for  the  ratio  of  variances  is: 


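$$P\left[\frac{S_1^2/S_2^2}{F_{1-\alpha/2}(k_1; k_2)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2/S_2^2}{F_{\alpha/2}(k_1; k_2)}\right] = 1 - \alpha \qquad (1.105)$$

For the previous example:

$$P\left[0.371 \le \frac{\sigma_1^2}{\sigma_2^2} \le 3.76\right] = 0.90$$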
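The test of Example 1.22 and the interval for the variance ratio use the same three numbers, so they can be sketched together. The code below takes the two sample variances from the example and the tabulated value of $F_{0.95}(9; 9)$, derives the lower critical value by the reciprocal rule of Eq. (1.95), and then forms the 90% limits for $\sigma_1^2/\sigma_2^2$. The variable names are illustrative assumptions.

```python
s1_sq, s2_sq = 11.60, 9.82   # sample variances from Example 1.22
f_95 = 3.18                  # F_0.95(9; 9) read from Table 1.7
f_05 = 1.0 / f_95            # F_0.05(9; 9) by the reciprocal rule, Eq. (1.95)

F = s1_sq / s2_sq                      # test statistic
reject_h0 = F < f_05 or F > f_95       # two-sided critical region, alpha = 0.10

ratio_low = F / f_95    # lower 90% limit for the ratio of population variances
ratio_high = F / f_05   # upper 90% limit for the ratio of population variances
print(round(F, 2), reject_h0, round(ratio_low, 3), round(ratio_high, 2))
```

The interval contains 1.0 exactly when the two-sided test does not reject equality of the variances, which is a useful consistency check between the two calculations.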


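Recapitulation of statistical tests


Table 1.8 Testing $\mu$ when $\sigma^2$ is known

Statistic: $Z = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Null hypothesis          Alternative               Rejection region for H0
$H_0: \mu = \mu_0$       $H_1: \mu \ne \mu_0$      $Z < -z_{1-\alpha/2}$ or $Z > z_{1-\alpha/2}$
$H_0: \mu \le \mu_0$     $H_1: \mu > \mu_0$        $Z > z_{1-\alpha}$
$H_0: \mu \ge \mu_0$     $H_1: \mu < \mu_0$        $Z < -z_{1-\alpha}$


Table 1.9 Testing $\mu$ when $\sigma^2$ is unknown

Statistic: $T = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

Null hypothesis          Alternative               Rejection region for H0
$H_0: \mu = \mu_0$       $H_1: \mu \ne \mu_0$      $T > t_{n-1,1-\alpha/2}$ or $T < -t_{n-1,1-\alpha/2}$
$H_0: \mu \le \mu_0$     $H_1: \mu > \mu_0$        $T > t_{n-1,1-\alpha}$
$H_0: \mu \ge \mu_0$     $H_1: \mu < \mu_0$        $T < -t_{n-1,1-\alpha}$


Table 1.10 Testing $\mu_1 - \mu_2$ when $\sigma_1^2$ and $\sigma_2^2$ are known

Statistic: $Z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$

Null hypothesis                Alternative                    Rejection region for H0
$H_0: \mu_1 - \mu_2 = 0$       $H_1: \mu_1 - \mu_2 \ne 0$     $Z > z_{1-\alpha/2}$ or $Z < -z_{1-\alpha/2}$
$H_0: \mu_1 - \mu_2 \le 0$     $H_1: \mu_1 - \mu_2 > 0$       $Z > z_{1-\alpha}$
$H_0: \mu_1 - \mu_2 \ge 0$     $H_1: \mu_1 - \mu_2 < 0$       $Z < -z_{1-\alpha}$
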

1.4  Tests  and  Estimates  on  Statistical 


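Table 1.11 Testing $\mu_1 - \mu_2$ when $\sigma_1^2$ and $\sigma_2^2$ are unknown but equal

Statistic: $T = \dfrac{\bar{X}_1 - \bar{X}_2}{S_p\sqrt{1/n_1 + 1/n_2}}$; $f = n_1 + n_2 - 2$

Null hypothesis                Alternative                    Rejection region for H0
$H_0: \mu_1 - \mu_2 = 0$       $H_1: \mu_1 - \mu_2 \ne 0$     $T < -t_{f,1-\alpha/2}$ or $T > t_{f,1-\alpha/2}$
$H_0: \mu_1 - \mu_2 \le 0$     $H_1: \mu_1 - \mu_2 > 0$       $T > t_{f,1-\alpha}$
$H_0: \mu_1 - \mu_2 \ge 0$     $H_1: \mu_1 - \mu_2 < 0$       $T < -t_{f,1-\alpha}$


Table 1.12 Testing the variance $\sigma^2$

Statistic: $\chi^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$

Null hypothesis                  Alternative                      Rejection region for H0
$H_0: \sigma^2 = \sigma_0^2$     $H_1: \sigma^2 \ne \sigma_0^2$   $\chi^2 > \chi^2_{n-1,1-\alpha/2}$ or $\chi^2 < \chi^2_{n-1,\alpha/2}$
$H_0: \sigma^2 \le \sigma_0^2$   $H_1: \sigma^2 > \sigma_0^2$     $\chi^2 > \chi^2_{n-1,1-\alpha}$
$H_0: \sigma^2 \ge \sigma_0^2$   $H_1: \sigma^2 < \sigma_0^2$     $\chi^2 < \chi^2_{n-1,\alpha}$


Table 1.13 Testing $\sigma_1^2 = \sigma_2^2$

Statistic: $F = \dfrac{S_1^2}{S_2^2}$; $k_1 = n_1 - 1$; $k_2 = n_2 - 1$

Null hypothesis                    Alternative                        Rejection region for H0
$H_0: \sigma_1^2 = \sigma_2^2$     $H_1: \sigma_1^2 \ne \sigma_2^2$   $F > F_{k_1,k_2,1-\alpha/2}$ or $F < F_{k_1,k_2,\alpha/2}$
$H_0: \sigma_1^2 \le \sigma_2^2$   $H_1: \sigma_1^2 > \sigma_2^2$     $F > F_{k_1,k_2,1-\alpha}$
$H_0: \sigma_1^2 \ge \sigma_2^2$   $H_1: \sigma_1^2 < \sigma_2^2$     $F < F_{k_1,k_2,\alpha}$
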

Problem  1.10  [4] 

Successive colorimetric determinations of the normality of a
K2Cr2O7 solution were as follows: 1.22; 1.23; 1.18; 1.31; 1.25; 1.22;
1.24 ( x 10 ). Sample variance is $\sigma_X^2$ = 49 x 10 . Determine the 95%
confidence interval of the mean.


Problem  1.11  [4] 

“Frigid-Flow  Co.”  has  received  final  test  results  on  the  company’s 
new  heat  exchanger.  The  values  given  below  are  overall  heat-transfer 
coefficients:  60;  63;  60;  68;  70;  72;  65;  61;  69;  67  BTU/h  ft2  °F.  At  the 
99%  confidence  level,  what  minimum  value  for  the  exchanger’s 
overall  heat-transfer  coefficient  can  the  company  suggest? 


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Problem  1.12 

Successive determinations of the opened bottles of HCl were found
to be, expressed in normalities:


Bottle N1: 15.75; 15.64; 15.92;
Bottle N2: 15.58; 15.49; 15.72.


The producer informs that the last delivery has the HCl concentration
variance $\sigma^2 = 0.016$. Determine the 95% confidence interval for
$\mu_1 - \mu_2$, where $\mu_1$ and $\mu_2$ are the means of HCl concentrations.


Problem  1.13 

In order to compare the effects of two solid catalyst component
concentrations on NO2 reduction, six groups of observations were
made. Each group consisted of three replicates of five observations
each, that is, a total of 15 determinations were made for each
concentration. The concentrations (in mass per cent) were 0.5 and
1.0%. The reduction data are summarized below.


Catalyst type A: 5.18; 5.52; 5.42

Catalyst type B: 5.58; 5.62; 5.82

Do the catalysts have the same efficiency rate?


Problem  1.14  [4] 

Five similar determinations of the cold water flow rate to a heat
exchanger were, in [GPM]: 5.84; 5.76; 6.03; 5.90; 5.87. Compute a
95% confidence interval for the imprecision affecting this measuring
operation.


Problem  1.15  [4] 

Reaction temperatures in degrees centigrade (measured on two different
days) for two catalyst concentrations were:


Concentration A: 310.95; 308.86; 312.80; 309.74; 311.03; 311.89;
310.93; 310.39; 310.24; 311.89; 309.65; 311.85; 310.73.


Concentration B: 308.94; 308.23; 309.98; 311.59; 309.46; 311.15;
311.29; 309.16; 310.68; 311.86; 310.98; 312.29; 311.21.


Find a 98% confidence interval for $\sigma_1^2/\sigma_2^2$.




Problem  1.16 

Data taken from the plate and frame filter press located in the unit
operations laboratory are used to determine $\alpha$, the specific cake
resistance, of a calcium carbonate slurry. Several values of $\alpha$,
expressed in ft/lb, have been calculated from data taken during the
fall semester.


2.49 x 10^11; 2.40 x 10^11; 2.43 x 10^11; 2.30 x 10^11; 2.53 x 10^11;
2.67 x 10^11; 2.60 x 10^11; 2.50 x 10^11; 2.54 x 10^11; 2.55 x 10^11.


Based on these values, predict the interval within which 90% of all
such values calculated in the future must fall.


Problem  1.17 

Due to the burning of cotton plant wastes (hulls, leaves, etc.), the
sulfate content in the air over a town is highest during the month of
November. The data given below are the mean values of sulfate
content during the month of November (analysis of air performed
daily) over the past 10 years.


Sulfate content SO4(2-), mg/m3:


10.83; 8.90; 14.71; 12.35; 11.86;
13.80; 11.75; 9.68; 9.33; 10.9.


Determine the 95% confidence interval for the sulfate content during next November.


Problem  1.18  [4] 

A company engaged in the manufacture of cast iron has employed a
system of raw material and processing procedures that has produced
a product whose overall population average silicon content
was 0.85%. A new contract was put into effect in which a new supplier
of raw material supplanted the old one. During the first month
of operation using the new material, random samples of the product
silicon content were found to be:


1.13; 0.80; 0.85; 0.60; 0.97; 0.92; 0.94; 0.72; 1.17; 0.87;
0.36; 0.68; 0.73; 0.82; 0.79; 0.87; 0.92; 0.81; 0.97; 0.48;
1.00; 0.92; 0.61; 0.81; 0.71; 0.97; 0.89; 0.68; 1.00; 1.16.

What are your 99% confidence limits on the silicon content of the
iron using the new raw material?


Problem  1.19  [4] 

An experiment conducted to compare the tensile strengths of two
types of synthetic fibers gave the breaking strengths shown below in
thousands of pounds force per square inch (PSI):


Fiber A: 14; 4; 10; 6; 3; 11; 12

Fiber B: 16; 17; 13; 12; 7; 16; 11; 8; 7.


Calculate the 99% confidence interval for the difference between the
means.
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■ Problem  1.20  [4] 

The  following  data  were  obtained  for  the  calibration  of  the  Ruska 
dead  weight  gauge  used  with  our  Burnett  PVT  Apparatus.  The  weights 
corresponding  to  the  1.000  PSI  loading  had  the  following  apparent 
masses: 

26.03570;  26.03581;  26.03529;  26.03573;  26.03575;  26.03551; 
26.03588;  26.03586;  26.03599;  26.03533;  26.03570. 

Can  we  say  that  the  average  apparent  mass  does  not  exceed  26.5? 


Problem  1.21 

Fifty determinations of a certain concentration yielded the following
values:


54.20; 51.73; 52.56; 53.55; 56.15; 57.50; 54.25; 54.46;
53.08; 53.82; 54.15; 53.10; 51.56; 53.43; 53.77; 55.88;
54.96; 58.51; 54.65; 55.13; 51.12; 53.73; 55.01; 55.57;
53.95; 53.39; 54.30; 52.89; 57.35; 55.77; 52.22; 54.55;
56.78; 56.00; 57.27; 54.89; 57.05; 56.25; 56.35; 56.52;
56.91; 52.35; 52.02; 52.94; 58.16; 57.73; 55.33; 54.13;
56.60; 55.21.


Test the hypothesis $H_0: \mu \ge 55.0$ at the 99% confidence level.


Problem  1.22 

Observing the values in Problem 1.12, there is enough reason to
believe that both bottles were filled up with HCl in the same production
line. The mean variance of concentration for the last year of
production is 0.016. Can we trust the assumption that both bottles
were filled with HCl from the same batch?


Problem  1.23 

On a pilot-plant for producing composite rocket propellant four
batches of the same composition were made under the same process
conditions with the batch of 10 micron ammonium perchlorate
ground at that moment. A month later the same ammonium perchlorate
was used to make three batches of the composite propellants
with the same composition. From the obtained propellant
experimental rocket motors were static fired. The following burning
rates at 70 [bar] pressure were determined from the calculated burning
rate laws:

Batch of AP A: 14.199; 14.531; 14.197; 14.193

Batch of AP B: 14.398; 14.418; 14.307.

Test the means of the composite propellant burning rates at 70 [bar].


Problem 1.24

Test the variances in the previous problem.


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Problem  1.25 

On a pilot-plant for producing composite rocket propellant four
batches of the same composition were made under the same process
conditions with each of two different burning-rate catalysts. From the
obtained propellant experimental rocket motors were static fired.
The following burning rates at 70 [bar] pressure were determined
from the calculated burning rate laws:


Catalyst A: 14.199; 14.531; 14.197; 14.193

Catalyst B: 15.716; 15.612; 15.682; 15.715.


Are there significant differences in catalyst effects on burning rates?


1.5 Analysis of Variance

The technique known as analysis of variance (ANOVA)2) uses tests based on variance
ratios to determine whether or not significant differences exist among the means of
several groups of observations, where each group follows a normal distribution. The
analysis of variance technique extends the t-test, used to determine whether or not
two means differ, to the case where there are three or more means.

The analysis of variance is used very widely in the biological, social and physical
sciences. The technique was first developed by R. A. Fisher and his colleagues in
England in the 1920s. Fisher has said that the analysis of variance is merely "a
convenient way of arranging the arithmetic". This statement points out that the statistical
principles underlying the analysis of variance are quite simple; but the calculations
can become quite involved, so that they require careful and systematic arrangement.

Analysis of variance is particularly useful when the basic difference between the
groups cannot be stated quantitatively. For example, suppose we wish to determine
whether there are any differences among the effects of four polymerization catalysts
on the setting time of a particular plastic. We make several runs under identical
conditions with each catalyst. We can then determine whether the mean setting times
for the four catalysts are different by using a one-way analysis of variance to
determine the effect of one independent variable (type of catalyst) on the dependent
variable (setting time). However, we cannot describe the type of catalyst by a quantitative
relationship. On the other hand, we might run a similar experiment in which we
use four different concentrations of a single catalyst. Now we can relate the four
groups quantitatively by concentration of catalyst. We could still use the analysis of
variance to see whether a change in concentration had any effect.

We might extend our first example using four different catalysts to study the
effect of temperature as well. We could pick three different temperatures and
determine the setting time for each of the four catalysts. This would require a two-way
analysis of variance to determine significant differences among the 12 setting times
that we would obtain.


2)  Dispersion  analysis 


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Finally,  we  might  add  a third  independent  variable,  catalyst  concentration,  to  type 
of catalyst and temperature of setting. We then would use a three-way analysis of
variance to determine whether differences in means exist. Of course, as additional
independent variables are added, the calculations become much more complex, so that
they are better carried out on a digital computer.

Notation  and  arithmetic 

Although the principles of analysis of variance are simple, the notation and
arithmetic can be quite confusing at first contact. For this reason, we begin by discussing
a few conventions in notation and arithmetic that we will use later. Suppose we have
the following matrix:

Table 1.14 Data matrix

                          Column
Row      1           2           ...   j           ...   J
1        X_11        X_12        ...   X_1j        ...   X_1J
2        X_21        X_22        ...   X_2j        ...   X_2J
...
i        X_i1        X_i2        ...   X_ij        ...   X_iJ
...
I        X_I1        X_I2        ...   X_Ij        ...   X_IJ
Sums     X_.1        X_.2        ...   X_.j        ...   X_.J
Means    Xbar_.1     Xbar_.2     ...   Xbar_.j     ...   Xbar_.J


Each data point is subscripted, first to identify its row location and second to
identify its column location. Thus, $X_{32}$ (read "X three two") is the data point in the third
row and second column. Each column may be regarded as a random sample of size I
drawn from a normal population. This matrix might represent the example of
one-way analysis of variance given earlier. The columns would be the four catalysts
and the rows would simply identify the successive runs made at identical conditions.

In the two-way example, the columns would again be the four catalysts, the rows
would be the three temperatures, and each X would be a single value of the setting
time for a given temperature and catalyst. Thus, $X_{32}$ is the setting time using the
third temperature and the second catalyst.

We designate a general location in the matrix of data as $X_{ij}$, where i refers to the
row, and j to the column. The sum of values in the i-th row is:

$$X_{i.} = \sum_{j=1}^{J} X_{ij} \qquad (1.106)$$

The dot refers to the direction that has been summed. The mean of the values in
the i-th row is then:

$$\bar{X}_{i.} = \frac{X_{i.}}{J} \qquad (1.107)$$

Similarly, the sum for any column j is:

$$X_{.j} = \sum_{i=1}^{I} X_{ij} \qquad (1.108)$$

and the mean is:

$$\bar{X}_{.j} = \frac{X_{.j}}{I} \qquad (1.109)$$

The sum of all the values in the matrix is designated by $X_{..}$ where:

$$X_{..} = \sum_{i=1}^{I}\sum_{j=1}^{J} X_{ij} = \sum_{i=1}^{I} X_{i.} = \sum_{j=1}^{J} X_{.j} \qquad (1.110)$$

The mean of all the values in the matrix is called the grand mean $\bar{X}_{..}$ where:

$$\bar{X}_{..} = \frac{X_{..}}{IJ} \qquad (1.111)$$

From here on, to simplify the equations, we will designate:

$$\sum_{i=1}^{I} = \sum_i \; ; \qquad \sum_{j=1}^{J} = \sum_j$$

As before, capital letters denote a random variable and small letters a
concrete value of the variable.
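The dot notation is easier to absorb with a concrete matrix. The sketch below takes the first subscript i as the row and the second subscript j as the column, which is the layout of Table 1.14, and computes the sums and means of Eqs. (1.106) to (1.111). The numbers are invented setting times for three runs (rows) with four catalysts (columns); they are not data from the text.

```python
# Hypothetical data matrix: I = 3 rows (runs), J = 4 columns (catalysts)
x = [
    [12.1, 11.4, 13.0, 12.6],
    [11.8, 11.9, 12.7, 12.2],
    [12.4, 11.1, 13.3, 12.9],
]
I, J = len(x), len(x[0])

row_sums = [sum(row) for row in x]                               # X_i. , Eq. (1.106)
row_means = [s / J for s in row_sums]                            # Eq. (1.107)
col_sums = [sum(x[i][j] for i in range(I)) for j in range(J)]    # X_.j , Eq. (1.108)
col_means = [s / I for s in col_sums]                            # Eq. (1.109)

total = sum(row_sums)            # X_.. , Eq. (1.110)
grand_mean = total / (I * J)     # Eq. (1.111)
print(col_means, round(grand_mean, 3))
```

With equal group sizes the grand mean is also the plain average of the column means, which gives a quick way to check a hand calculation.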


One-way  analysis  of  variance 

In one-way analysis of variance, we have several groups for which we wish to test
equality of means. To apply the standard methods, we must assume that each group
is normally distributed and that the population variance $\sigma_X^2$ is constant among the
groups. In other words, one-way analysis of variance is used in situations when we
want to test the equality of J population means. The procedure is based on the
assumption that each of the J groups of I observations is a random sample from a normal
population with a variance $\sigma_X^2 = \sigma^2$ that is common to all the groups.

Actually, analysis of variance may be applied where these criteria are not exactly
met; but the procedure is based on these assumptions.

Suppose we wish to estimate the population variance. There are two possible
estimates. First, we might use a pooled estimate such as that in an earlier section, in
which we used the t-test on two means. A second method of estimating the population
variance is to calculate the variance of the group means around the grand mean.
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Pooled  variance  estimates 

In the pooled estimate, we calculate the sample variance for each group (each column of data in a one-way analysis). Then we weight each of these estimates by its degrees of freedom to obtain a pooled sample variance. For any column j, the sample variance is:

$$S_j^2 = \frac{\sum_i (X_{ij} - \bar{X}_{.j})^2}{I-1} \tag{1.112}$$

We have already asserted that $E(S_j^2) = \sigma^2$ for each j. We will assume for simplicity that each column contains the same number of values (i.e., I is constant). We have J estimates of the form of Eq. (1.112). To pool them, we weight each by its degrees of freedom (I-1), add, and divide by the total degrees of freedom J(I-1). The pooled sample variance is then:

$$S_p^2 = \frac{\sum_j \left[(I-1)\,S_j^2\right]}{J(I-1)} \tag{1.113}$$
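The pooling step in Eqs. (1.112) and (1.113) can be sketched in a few lines of Python. This is a minimal illustration, not part of the original text: the function names and the small data set are hypothetical, and equal column sizes are assumed, as in the derivation above.

```python
def sample_variance(column):
    """Sample variance of one column, Eq. (1.112)."""
    n = len(column)
    mean = sum(column) / n
    return sum((x - mean) ** 2 for x in column) / (n - 1)


def pooled_variance(columns):
    """Pooled sample variance, Eq. (1.113); assumes every column has I values."""
    I = len(columns[0])   # observations per column
    J = len(columns)      # number of columns (groups)
    weighted = sum((I - 1) * sample_variance(col) for col in columns)
    return weighted / (J * (I - 1))


# Hypothetical data: J = 2 columns with I = 3 observations each.
columns = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
s2_pooled = pooled_variance(columns)
print(s2_pooled)
```

With equal column sizes the pooled variance is simply the average of the column variances (here 1 and 4).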


Combining Eqs. (1.112) and (1.113) gives:

$$S_p^2 = \frac{\sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2}{J(I-1)} = \frac{SS_W}{J(I-1)} = MS_W \tag{1.114}$$


In this equation, the term $SS_W$ is referred to as the sum of squares within groups or error sum of squares. The quantity $SS_W$, when divided by the appropriate degrees of freedom J(I-1), is referred to as the within-groups mean square or error mean square and is denoted by $MS_W$. As Eq. (1.114) is not particularly convenient for calculation purposes, it can be presented in the more usable form:

$$S_p^2 = \frac{\sum_j \sum_i X_{ij}^2 - \frac{1}{I}\sum_j X_{.j}^2}{J(I-1)} = \frac{SS_W}{J(I-1)} = MS_W \tag{1.115}$$

where $X_{.j} = \sum_i X_{ij}$ is the total of column j.


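The pooled estimator of the population variance, $S_p^2$, is an unbiased estimator for $\sigma^2$ regardless of whether the population means $\mu_j$ are equal or not, because it takes into account deviations from each group mean $\bar{X}_{.j}$, j = 1, 2,..., J. Unbiasedness follows from Eq. (1.114) since:

$$E(S_p^2) = E\left[\frac{\sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2}{J(I-1)}\right] = \frac{1}{J}\sum_j E\left[\frac{\sum_i (X_{ij} - \bar{X}_{.j})^2}{I-1}\right] = \frac{1}{J}\sum_j E(S_j^2) = \sigma^2 \tag{1.116}$$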



Variance  of  group  means 

A second method of estimating the population variance $\sigma^2$ is to calculate the sample variance of the group means around the grand mean by use of:

$$S_{\bar{X}}^2 = \frac{\sum_j (\bar{X}_{.j} - \bar{X}_{..})^2}{J-1} \tag{1.117}$$

If the group population means are all equal, then $S_{\bar{X}}^2$ is an unbiased estimate of the variance of a group mean, $\sigma_{\bar{X}}^2$. To obtain an estimator of the population variance $\sigma^2$, recall that:

$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{I} \tag{1.118}$$


Combining Eqs. (1.117) and (1.118) we have our second estimator of the population variance $\sigma^2$:

$$S^2 = \frac{I \sum_j (\bar{X}_{.j} - \bar{X}_{..})^2}{J-1} = \frac{SS_B}{J-1} = MS_B \tag{1.119}$$


The quantities $SS_B$ and $MS_B$ are usually referred to as the between-groups sum of squares and the between-groups mean square, respectively. Eq. (1.119) is not suitable for practical calculations, so it is transformed into the following expression:

$$S^2 = \frac{\frac{1}{I}\sum_j X_{.j}^2 - \frac{X_{..}^2}{IJ}}{J-1} = \frac{SS_B}{J-1} = MS_B \tag{1.120}$$


The estimator $S^2$ given in Eqs. (1.119) and (1.120) is an unbiased estimator of $\sigma^2$ only when the group population means are equal. If the population means $\mu_1, \mu_2, ..., \mu_J$ are not all equal, then the estimator $S^2$ overestimates $\sigma^2$, that is, $E(S^2) > \sigma^2$. The estimators $S^2$ and $S_p^2$ are linked by a very important identity given by:

$$\sum_j \sum_i (X_{ij} - \bar{X}_{..})^2 = I \sum_j (\bar{X}_{.j} - \bar{X}_{..})^2 + \sum_j \sum_i (X_{ij} - \bar{X}_{.j})^2 \tag{1.121}$$

The left-hand side of Eq. (1.121) is usually referred to as the total sum of squares corrected for the mean and is denoted by $SS_{TC}$. Combining Eqs. (1.114), (1.119) and (1.121) gives:

$$SS_{TC} = SS_B + SS_W \tag{1.122}$$
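The identity in Eq. (1.121) can be checked numerically from the definitional sums. The sketch below is only an illustration: the function name and the three-column data set are hypothetical, and equal column sizes are assumed.

```python
def sums_of_squares(columns):
    """Return (SS_TC, SS_B, SS_W) from the definitions in Eq. (1.121)."""
    I = len(columns[0])
    J = len(columns)
    grand_mean = sum(sum(col) for col in columns) / (I * J)
    col_means = [sum(col) / I for col in columns]
    # Left-hand side: deviations of every observation from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for col in columns for x in col)
    # First term on the right: deviations of column means from the grand mean.
    ss_between = I * sum((m - grand_mean) ** 2 for m in col_means)
    # Second term on the right: deviations of observations from their column mean.
    ss_within = sum((x - m) ** 2
                    for col, m in zip(columns, col_means) for x in col)
    return ss_total, ss_between, ss_within


# Hypothetical data: J = 3 columns, I = 3 observations each.
columns = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 5.0, 8.0]]
ss_tc, ss_b, ss_w = sums_of_squares(columns)
print(ss_tc, ss_b, ss_w)
```

For these numbers the between-groups and within-groups parts are 24 and 16, which add up to the total of 40.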

In other words, the total variation is partitioned into two components: a component $SS_B$ that reflects variation among groups and a component $SS_W$ that reflects experimental error or sampling variation. The degrees of freedom associated with $SS_{TC}$ are also partitioned into the degrees of freedom associated with $SS_B$ and $SS_W$, i.e., IJ-1 = (J-1) + J(I-1). If the means $\mu_1, \mu_2, ..., \mu_J$ are all equal, then $S_p^2$ and $S^2$ are independent, so that the random variable:

$$F = \frac{S^2}{S_p^2} = \frac{SS_B/(J-1)}{SS_W/[J(I-1)]} = \frac{MS_B}{MS_W} \tag{1.123}$$

has an F-distribution with J-1 and J(I-1) degrees of freedom. Eq. (1.123) illustrates the basis of the analysis of variance: two variances are compared in order to test the equality of means. Thus under $H_0: \mu_1 = \mu_2 = ... = \mu_J$ we would expect the value of F to be close to 1. If $H_0$ is not true, then $S^2$ would tend to be larger than $S_p^2$, which would force F to be larger than 1. Consequently, based on the data, the hypothesis $H_0$ would be rejected if the computed F-value is too large. That is, the rejection region is of the form:

$$F > F_{k_1;k_2;1-\alpha}$$

where:

$k_1 = J-1$ and $k_2 = J(I-1)$

Fisher introduced the following table for a clear presentation of variance analysis results:


Table 1.15 One-way analysis of variance

Source of variation       f         Sum of squares                                                  Mean square                  Test statistic
Between columns           J-1       $SS_B = \sum_j X_{.j}^2 / I - X_{..}^2/(IJ)$                   $MS_B = SS_B/(J-1)$          $F = MS_B/MS_W$
Within columns - error    J(I-1)    $SS_W = \sum_i \sum_j X_{ij}^2 - \sum_j X_{.j}^2 / I$          $MS_W = SS_W/[J(I-1)]$       -
Total                     IJ-1      $SS_T = \sum_i \sum_j X_{ij}^2 - X_{..}^2/(IJ)$                -                            -

Example  1.23  [11] 

A quantity of each of three chemical fertilizers was applied to three groups of five corn plants each, with all plants growing under identical conditions of temperature, rainfall, soil, seed, etc. From the following measures of corn growth (height after one month), determine whether there is any reason for one fertilizer to be better than another:

                   Fertilizer
                   No. 1     No. 2     No. 3
                   23        16        18
                   21        23        22
                   24        20        25
                   17        21        21
                   19        18        20
$X_{.j}$           104       98        106
$\bar{X}_{.j}$     20.8      19.6      21.2

First, we evaluate the summations:

$$\sum_i \sum_j X_{ij}^2 = 23^2 + 21^2 + 24^2 + 17^2 + 19^2 + 16^2 + ... + 20^2 = 6420$$

$$\sum_j X_{.j}^2 = 104^2 + 98^2 + 106^2 = 31656$$

$$X_{..}^2 = (104 + 98 + 106)^2 = 94864$$

Then from Eqs. (1.115) and (1.120), we get:

$$S_p^2 = \frac{6420 - \frac{31656}{5}}{3(5-1)} = 7.40; \quad k_2 = 12$$

$$S^2 = \frac{\frac{31656}{5} - \frac{94864}{3 \times 5}}{3-1} = 3.50; \quad k_1 = 2$$


From this point, we may use the 7-step test procedure:

1.  Assume underlying normally distributed populations for each fertilizer group. All groups have constant population variance $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$.

2.  $H_0: \mu_1 = \mu_2 = \mu_3$;  $H_1$: $H_0$ is not true

3.  Test statistic: $F = S^2 / S_p^2$

4.  Let $\alpha$ = 0.05

5.  We reject $H_0$ if F is higher than $F_{0.95}(2;12)$ = 3.88

6.  F = 3.50/7.40 = 0.47

7.  Accept $H_0$ since 0.47 < 3.88

Thus, there is no reason to believe that one fertilizer promotes growth more than another. Generally speaking, analysis-of-variance problems are not solved in the form used in this example. A standard form called the analysis-of-variance table has been developed, and it is particularly useful for more complex problems:

$$SS_T = \sum_i \sum_j X_{ij}^2 - \frac{X_{..}^2}{IJ} = 6420 - \frac{94864}{15} = 95.8$$

$$SS_B = \frac{\sum_j X_{.j}^2}{I} - \frac{X_{..}^2}{IJ} = \frac{31656}{5} - \frac{94864}{3 \times 5} = 7.0$$

$$SS_W = SS_T - SS_B = 95.8 - 7.0 = 88.8$$


Table 1.16 Analysis of variance

Source of variation    f     SS      MS      F       $F_{2;12;0.95}$
Between groups         2     7.0     3.50    0.47    3.88
Within groups          12    88.8    7.40    -       -
Total                  14    95.8    -       -       -
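The arithmetic of Example 1.23 can be reproduced with the shortcut formulas (1.115) and (1.120). The following is a minimal sketch; the function name and the dictionary keys are illustrative choices rather than anything defined in the text, and the critical value 3.88 is simply copied from the example.

```python
def one_way_anova(columns):
    """One-way ANOVA via Eqs. (1.115) and (1.120); assumes equal column sizes."""
    I = len(columns[0])                                        # values per column
    J = len(columns)                                           # number of columns
    sum_sq = sum(x * x for col in columns for x in col)        # sum_i sum_j X_ij^2
    col_totals_sq = sum(sum(col) ** 2 for col in columns)      # sum_j X_.j^2
    grand_total_sq = sum(sum(col) for col in columns) ** 2     # X_..^2
    ss_b = col_totals_sq / I - grand_total_sq / (I * J)
    ss_w = sum_sq - col_totals_sq / I
    ms_b = ss_b / (J - 1)
    ms_w = ss_w / (J * (I - 1))
    return {"SSB": ss_b, "SSW": ss_w, "SST": ss_b + ss_w,
            "MSB": ms_b, "MSW": ms_w, "F": ms_b / ms_w}


# Corn growth data of Example 1.23, one list per fertilizer.
fertilizer = [[23, 21, 24, 17, 19],
              [16, 23, 20, 21, 18],
              [18, 22, 25, 21, 20]]
result = one_way_anova(fertilizer)
F_CRITICAL = 3.88   # tabulated F(2; 12; 0.95) quoted in the example
print(result, result["F"] > F_CRITICAL)
```

Carried without intermediate rounding, the between-groups sum of squares is about 6.93 and the total about 95.73; the tabulated 7.0 and 95.8 reflect rounding in the hand calculation. The F-ratio is 0.47 either way.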

Example  1.24  [4] 

The amount of fluoride in the local water supply was determined by four colorimetric methods, A, B, C and D, in a comparative study. Five replications were made for each test. To preclude bias from variations in the sample over the time required for the analysis, all samples were taken from a single 10-gal carboy of water. The results in ppm are:

A:  2;  3;  6;  5;  4;

B:  5;  4;  4;  2;  3;

C:  1;  3;  2;  4;  4;

D:  2;  1;  1;  2;  1.

a)  Are the methods equivalent? Use the 5% significance level.

b)  What are the 95% confidence limits on the values obtained from each method?


a)  We first calculate the required sums and sums of squares:

Method    $X_{.j}$    $\sum_i X_{ij}^2$    $\bar{X}_{.j}$
A         20          90                   4.0
B         18          70                   3.6
C         14          46                   2.8
D         7           11                   1.4

$\sum_j X_{.j}^2 = 969$;  $\sum_i \sum_j X_{ij}^2 = 217$;  $X_{..} = 59$.

We calculate the pooled sample variance:

$$S_p^2 = \frac{\sum_i \sum_j X_{ij}^2 - \frac{1}{I}\sum_j X_{.j}^2}{J(I-1)} = \frac{217 - 193.8}{16} = 1.45$$

Table 1.17 Analysis of variance

Source of variation    f     SS       MS      F       $F_{3;16;0.95}$
Between groups         3     19.75    6.58    4.54    3.24
Error                  16    23.20    1.45    -       -
Total                  19    42.95    -       -       -

Based on the F-test, we reject $H_0$ and conclude that there are significant differences among the applied methods.
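Part a) can also be verified from the definitional sums of squares rather than the shortcut formulas. This is a sketch under the assumptions of the example (equal replication, I = 5, J = 4); the variable names are illustrative and the critical value 3.24 is taken from Table 1.17.

```python
# Fluoride determinations (ppm) of Example 1.24, keyed by method.
fluoride = {"A": [2, 3, 6, 5, 4],
            "B": [5, 4, 4, 2, 3],
            "C": [1, 3, 2, 4, 4],
            "D": [2, 1, 1, 2, 1]}

I = 5   # replications per method
J = 4   # number of methods

means = {name: sum(values) / I for name, values in fluoride.items()}
grand_mean = sum(sum(values) for values in fluoride.values()) / (I * J)

# Between-groups and error sums of squares from their definitions.
ss_between = I * sum((m - grand_mean) ** 2 for m in means.values())
ss_error = sum((x - means[name]) ** 2
               for name, values in fluoride.items() for x in values)

ms_between = ss_between / (J - 1)
ms_error = ss_error / (J * (I - 1))     # pooled variance S_p^2
f_value = ms_between / ms_error

F_CRITICAL = 3.24                        # F(3; 16; 0.95) from Table 1.17
reject_h0 = f_value > F_CRITICAL
print(ss_between, ss_error, f_value, reject_h0)
```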

b)  For 16 degrees of freedom, at the 95% confidence level, $t_{16;0.975} = 2.120$ and $t_{16;0.025} = -2.120$. The standard error is found as before:

$$\frac{S_p}{\sqrt{I}} = \frac{\sqrt{1.45}}{\sqrt{5}} = 0.5385$$

The half-width of each interval is 2.12 × 0.5385 = 1.14, so the confidence limits for the means corresponding to the four colorimetric methods are:

$\mu_A$: 4.0 ± 2.12 × 0.5385  =>  2.86 < $\mu_A$ < 5.14
$\mu_B$: 3.6 ± 2.12 × 0.5385  =>  2.46 < $\mu_B$ < 4.74
$\mu_C$: 2.8 ± 2.12 × 0.5385  =>  1.66 < $\mu_C$ < 3.94
$\mu_D$: 1.4 ± 2.12 × 0.5385  =>  0.26 < $\mu_D$ < 2.54


Example  1.25  [12] 

The  conductivity  of  four  different  coatings  on  cathode  tubes  was  tested.  As  only  four 
types  of  coatings  were  tested,  we  had  a one-way  experiment  on  four  levels.  The  men- 
tioned levels  were  qualitative  as  we  had  no  quantitative  measure  for  coating  types.  Five 
cathode  tubes  were  tested  for  each  coating.  The  sequence  of  conductivity  measurements 
was  completely  random.  The  obtained  results  are  given  in  the  following  table: 

Coating

I      II     III    IV
56     64     45     42
55     61     46     39
62     50     45     45
59     55     39     43
60     56     43     41

If we subtract 50 from each value we obtain coded values, which greatly simplify the arithmetic and have no influence on the F-statistic.

                       I       II      III     IV
                       6       14      -5      -8
                       5       11      -4      -11
                       12      0       -5      -5
                       9       5       -11     -7
                       10      6       -7      -9
$X_{.j}$               42      36      -32     -40      $X_{..} = 6$
I                      5       5       5       5        I × J = 20
$\sum_i X_{ij}^2$      386     378     236     340      $\sum_i \sum_j X_{ij}^2 = 1340$

$$SS_T = \sum_i \sum_j X_{ij}^2 - \frac{X_{..}^2}{IJ} = 1340 - \frac{6^2}{20} = 1338.2$$

$$SS_B = \frac{\sum_j X_{.j}^2}{I} - \frac{X_{..}^2}{IJ} = \frac{5684}{5} - \frac{6^2}{20} = 1135.0$$

$$SS_W = SS_T - SS_B = 1338.2 - 1135.0 = 203.2$$


Table 1.18 Analysis of variance

Source of variation    f     SS        MS       F       $F_{3;16;0.95}$
Between groups         3     1135.0    378.3    29.8    3.24
Error                  16    203.2     12.7     -       -
Total                  19    1338.2    -        -       -

Since the calculated F value is greater than the tabulated one, the null hypothesis can be rejected at the 95% confidence level, i.e., the alternative hypothesis that there is a statistically significant difference between the coatings is accepted.
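The claim that coding the data (subtracting 50) has no influence on the F-statistic can be checked directly. The sketch below computes the F-ratio from the definitional formulas for both the raw and the coded conductivity values; the helper name `f_ratio` is an illustrative choice.

```python
def f_ratio(columns):
    """F = MS_B / MS_W for a one-way layout with equal column sizes."""
    I = len(columns[0])
    J = len(columns)
    col_means = [sum(col) / I for col in columns]
    grand_mean = sum(col_means) / J          # valid because all columns have I values
    ms_between = I * sum((m - grand_mean) ** 2 for m in col_means) / (J - 1)
    ms_within = sum((x - m) ** 2
                    for col, m in zip(columns, col_means)
                    for x in col) / (J * (I - 1))
    return ms_between / ms_within


# Conductivity data of Example 1.25, one list per coating.
raw = [[56, 55, 62, 59, 60],
       [64, 61, 50, 55, 56],
       [45, 46, 45, 39, 43],
       [42, 39, 45, 43, 41]]
coded = [[x - 50 for x in col] for col in raw]   # subtract 50 from every value

f_raw = f_ratio(raw)
f_coded = f_ratio(coded)
print(f_raw, f_coded)
```

Shifting every observation by the same constant shifts every mean by that constant, so all deviations, and hence both mean squares, are unchanged.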

Model  for  one-way  analysis  of  variance 

Up to now we have concentrated on the technique of calculation in the analysis of variance. Now let us briefly consider the underlying theory, starting with the model for a one-way analysis of variance. Here it is assumed that the columns of data are J random samples from J independent normal populations with means $\mu_1, \mu_2, ..., \mu_J$ and common variance $\sigma^2$. The one-way analysis of variance technique gives us a procedure for testing the hypothesis $H_0: \mu_1 = \mu_2 = ... = \mu_J$ against the alternative $H_1$: at least two $\mu_j$ are not equal. The statistical model gives us the structure of each observation in the I×J matrix:

$$X_{ij} = \mu_j + \varepsilon_{ij} \tag{1.124}$$

This model says that the dependent variable $X_{ij}$ is made up of two parts: the first part, $\mu_j$, is the mean of the population corresponding to the j-th column (population), and the second part, $\varepsilon_{ij}$, is the random experimental error, which is taken to have mean 0, i.e., $E(\varepsilon_{ij}) = 0$. This must be the case since $E(X_{ij}) = \mu_j$. The model in Eq. (1.124) can be written as:

$$X_{ij} = \mu_j + \mu - \mu + \varepsilon_{ij} = \mu + (\mu_j - \mu) + \varepsilon_{ij} = \mu + \alpha_j + \varepsilon_{ij} \tag{1.125}$$

where:

$\mu = \frac{\sum_j \mu_j}{J}$ is called the grand population mean;

$\alpha_j = \mu_j - \mu$ is called the effect of the j-th population.

Eq. (1.125) states that any experimental value is the sum of a term representing the general location of the grand population mean, plus a term $\alpha_j$ showing the displacement of a given population from the general location, plus a term giving the random experimental error $\varepsilon_{ij}$ of the particular observation.

The $\varepsilon_{ij}$ are independent and normally distributed with mean 0 and variance $\sigma^2$; they result from random fluctuations in the process and from measurement errors. The population grand mean may be considered the main addend of $X_{ij}$.

The $\alpha_j$ is the column contribution (the contribution that arises if the column population means are different, so that each column mean differs from the grand population mean).

This means that $\alpha_j$ has already been defined as:

$$\alpha_j = \mu_j - \mu \tag{1.126}$$

Note that $\alpha_j$ is a constant for any column in a specific analysis of variance, as shown in Fig. 1.17.


Figure 1.17 Model defines terms for one-way ANOVA (density curves f(X) of the populations with means $\mu_1$, $\mu_2$, $\mu_3$, the grand mean $\mu$ and an effect $\alpha_j$)

If all the column means are equal, then $\mu_j = \mu$ and $\alpha_j = 0$ for every j. Therefore, the hypothesis $H_0: \mu_1 = \mu_2 = ... = \mu_J$ is the same as the hypothesis $H_0: \alpha_j = 0$ for all j. One is generally interested in the hypothesis $H_0: \alpha_1 = \alpha_2 = ... = \alpha_J = 0$, which states that there is no population or column effect. This means that the variation in $\bar{X}_{.1}, \bar{X}_{.2}, ..., \bar{X}_{.J}$ is due to experimental error and not to any difference in population means. To test the hypothesis $H_0$ we use the F-statistic given in Eq. (1.123):

$$F = \frac{SS_B/(J-1)}{SS_W/[J(I-1)]} = \frac{MS_B}{MS_W}$$

The rejection region at significance level $\alpha$ is $F > F_{J-1;J(I-1);1-\alpha}$.

Recall from Eq. (1.116) that:

$$E(S_p^2) = E(MS_W) = \sigma^2$$

Furthermore, if $H_0: \alpha_1 = \alpha_2 = ... = \alpha_J = 0$ is true, then:

$$E(S^2) = E(MS_B) = \sigma^2$$

However, if $H_0$ is not true then $E(S^2) \neq \sigma^2$. To show this, consider:

$$E(S^2) = \sigma^2 + \frac{I}{J-1}\sum_j \alpha_j^2 \tag{1.127}$$

Eq. (1.116) indicates that the pooled estimate $S_p^2$ is unbiased, whereas Eq. (1.127) shows that $S^2$ is biased upward whenever some $\alpha_j \neq 0$. For the F-test we use the unbiased estimate of $\sigma^2$ in the denominator of the F-ratio and the biased estimate in the numerator.
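Eq. (1.127) can be evaluated for any assumed set of population means to see how far the numerator of the F-ratio is expected to rise above $\sigma^2$. The sketch below is purely illustrative: the function name and the numerical inputs are hypothetical.

```python
def expected_ms_between(sigma2, population_means, I):
    """Expected between-groups mean square, Eq. (1.127).

    sigma2           -- common population variance
    population_means -- list of the J population means mu_j
    I                -- number of observations per group
    """
    J = len(population_means)
    grand_mean = sum(population_means) / J
    effects = [mu_j - grand_mean for mu_j in population_means]   # alpha_j, Eq. (1.126)
    return sigma2 + I / (J - 1) * sum(a * a for a in effects)


# Hypothetical case: sigma^2 = 4, I = 5 observations per group.
under_h0 = expected_ms_between(4.0, [10.0, 10.0, 10.0], I=5)   # all means equal
under_h1 = expected_ms_between(4.0, [9.0, 10.0, 11.0], I=5)    # effects -1, 0, +1
print(under_h0, under_h1)
```

With equal means the expected value reduces to $\sigma^2$ = 4; with effects of -1, 0 and +1 it rises to 4 + (5/2)·2 = 9, which is why large F-ratios are taken as evidence against $H_0$.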

Two-way  analysis  of  variance 

If we desire to study the effects of two independent variables (factors) on one dependent variable, we will have to use a two-way analysis of variance. For this case the columns represent various values or levels of one independent factor and the rows represent levels or values of the other independent factor. Each entry in the matrix of data points then represents one of the possible combinations of the two independent factors and how it affects the dependent variable. Here, we will consider the case of only one observation per data point. We now have two hypotheses to test. First, we wish to determine whether variation in the column variable affects the column means. Secondly, we want to know whether variation in the row variable has an effect on the row means. To test the first hypothesis, we calculate a "between columns" sum of squares, and to test the second hypothesis, we calculate a "between rows" sum of squares. The between-rows mean square is an estimate of the population variance, provided that the row means are equal. If they are not equal, then the expected value of the between-rows mean square is higher than the population variance. Therefore, if we compare the between-rows mean square with another unbiased estimate of the population variance,
we can construct an F test to determine whether the row variable has an effect. Definitional and calculational formulas for these quantities are given in Table 1.19.

Table 1.19 Two-way analysis of variance

Source of variation        Degrees of freedom    Sum of squares - definition                                                            Sum of squares - practical calculation                      Mean squares                    Test statistic
Between columns            J-1                   $SS_C = I \sum_j (\bar{X}_{.j} - \bar{X}_{..})^2$                                      $SS_C = \sum_j X_{.j}^2 / I - X_{..}^2/(IJ)$                $MS_C = SS_C/(J-1)$             $F = MS_C/MS_E$
Between rows               I-1                   $SS_R = J \sum_i (\bar{X}_{i.} - \bar{X}_{..})^2$                                      $SS_R = \sum_i X_{i.}^2 / J - X_{..}^2/(IJ)$                $MS_R = SS_R/(I-1)$             $F = MS_R/MS_E$
Residual variance - error  (I-1)(J-1)            $SS_E = \sum_i \sum_j (X_{ij} - \bar{X}_{i.} - \bar{X}_{.j} + \bar{X}_{..})^2$         $SS_E = SS_T - SS_C - SS_R$                                 $MS_E = SS_E/[(I-1)(J-1)]$      -
Total                      IJ-1                  $SS_T = \sum_i \sum_j (X_{ij} - \bar{X}_{..})^2$                                       $SS_T = \sum_i \sum_j X_{ij}^2 - X_{..}^2/(IJ)$             -                               -


We note from Table 1.19 that the sums of squares between rows and between columns do not add up to the defined total sum of squares. The difference is called the sum of squares for error, since it arises from the experimental error present in each observation. Statistical theory shows that the corresponding mean square for error is an unbiased estimate of the population variance, regardless of whether the hypotheses are true or not. Therefore, we construct an F-ratio using the between-rows mean square divided by the mean square for error. Similarly, to test the column effects, the F-ratio is the between-columns mean square divided by the mean square for error. We will reject the hypothesis of no difference in means when these F-ratios become too much greater than 1. The ratios would be expected to be close to 1 if all the means were identical and the assumptions of normality and random sampling held. Now let us try the following example that illustrates two-way analysis of variance.
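That the error sum of squares is exactly the remainder $SS_T - SS_C - SS_R$ can be checked against its definition in Table 1.19. The sketch below uses a small hypothetical 2×3 table (rows × columns, one observation per cell); the function name is an illustrative choice.

```python
def two_way_sums_of_squares(table):
    """Definitional sums of squares of Table 1.19; table[i][j] is row i, column j."""
    I = len(table)        # number of rows
    J = len(table[0])     # number of columns
    row_means = [sum(row) / J for row in table]
    col_means = [sum(table[i][j] for i in range(I)) / I for j in range(J)]
    grand_mean = sum(sum(row) for row in table) / (I * J)

    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_rows = J * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = I * sum((m - grand_mean) ** 2 for m in col_means)
    # Error term from its own definition: residual after removing row and column means.
    ss_error = sum((table[i][j] - row_means[i] - col_means[j] + grand_mean) ** 2
                   for i in range(I) for j in range(J))
    return ss_total, ss_cols, ss_rows, ss_error


# Hypothetical data: I = 2 rows, J = 3 columns.
table = [[1.0, 2.0, 3.0],
         [2.0, 4.0, 9.0]]
ss_t, ss_c, ss_r, ss_e = two_way_sums_of_squares(table)
print(ss_t, ss_c, ss_r, ss_e)
```

Here the total of 41.5 splits into 21 (columns), 13.5 (rows) and 7 (error), so the error term computed from residuals agrees with the difference used in practical calculations.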

Example  1.26  [11] 

Determine  whether  the  type  of  catalyst  or  temperature  has  any  effect  on  the  setting 
time  of  a new  plastic  from  the  following  data.  The  measured  variable-response  is 
elapsed  setting  time  (in  minutes)  to  a uniform  criterion  of  hardness. 


                     Catalyst
Temperature [°C]     No. 1    No. 2    No. 3    No. 4    $X_{i.}$
25                   25       28       22       24       99
50                   27       29       23       23       102
75                   30       32       26       29       117
$X_{.j}$             82       89       71       76       $X_{..} = 318$

The analysis-of-variance table is constructed from the following quantities calculated from the data:

I = 3;  J = 4;

$$\sum_j X_{.j}^2 = 82^2 + 89^2 + 71^2 + 76^2 = 25462$$

$$X_{..}^2 = 318^2 = 101124$$

$$\sum_i X_{i.}^2 = 99^2 + 102^2 + 117^2 = 33894$$

$$\sum_i \sum_j X_{ij}^2 = 25^2 + 27^2 + ... + 23^2 + 29^2 = 8538$$

Inspection  of  Table  1.20  shows  that  we  reject  the  hypotheses  of  no  effect  of  the 
column  variable  or  the  row  variable.  Both  type  of  catalyst  and  temperature  seem  to 
have  an  effect.  Of  course,  we  have  made  only  a preliminary  survey.  We  would  now 
take  more  data  to  determine  which  catalyst  was  best  and  to  evaluate  a quantitative 
relationship  on  the  temperature  effect. 


Table 1.20 Analysis of variance

Source of variation    f     SS       MS      F       $F_t$
Between columns        3     60.3     20.1    28.7    $F_{3;6;0.95}$ = 4.76
Between rows           2     46.5     23.3    33.3    $F_{2;6;0.95}$ = 5.14
Error                  6     4.2      0.7     -       -
Total                  11    111.0    -       -       -
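The entries of Table 1.20 follow from the practical-calculation formulas of Table 1.19. A minimal sketch is given below; the function name and result keys are illustrative, and the critical values 4.76 and 5.14 are those quoted in the table.

```python
def two_way_anova(table):
    """Two-way ANOVA, one observation per cell; table[i][j] is row i, column j."""
    I = len(table)
    J = len(table[0])
    row_totals = [sum(row) for row in table]                              # X_i.
    col_totals = [sum(table[i][j] for i in range(I)) for j in range(J)]   # X_.j
    correction = sum(row_totals) ** 2 / (I * J)                           # X_..^2 / IJ

    ss_t = sum(x * x for row in table for x in row) - correction
    ss_c = sum(t * t for t in col_totals) / I - correction
    ss_r = sum(t * t for t in row_totals) / J - correction
    ss_e = ss_t - ss_c - ss_r
    ms_e = ss_e / ((I - 1) * (J - 1))
    return {"SST": ss_t, "SSC": ss_c, "SSR": ss_r, "SSE": ss_e,
            "F_columns": (ss_c / (J - 1)) / ms_e,
            "F_rows": (ss_r / (I - 1)) / ms_e}


# Setting times of Example 1.26: rows are temperatures, columns are catalysts.
setting_time = [[25, 28, 22, 24],
                [27, 29, 23, 23],
                [30, 32, 26, 29]]
anova = two_way_anova(setting_time)
print(anova)
```

Without intermediate rounding the F-ratios come out near 29.0 (columns) and 33.5 (rows); the tabulated 28.7 and 33.3 use mean squares rounded to one decimal. Both exceed their critical values in either case.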

Example  1.27  [4] 

In an experiment to determine the effects of varying the reflux ratio on the number of required stages $X_{ij}$ used in the separation of benzene and toluene, four different lab groups used the same four reflux ratios with the following results:

                Reflux ratio
Lab group       No. 1    No. 2    No. 3    No. 4    $X_{i.}$
1               11.4     9.2      7.5      6.2      34.3
2               10.7     8.6      8.3      5.9      33.5
3               11.9     8.7      9.3      5.4      35.3
4               9.9      9.0      7.1      5.6      31.6
$X_{.j}$        43.9     35.5     32.2     23.1     $X_{..} = 134.7$

$$SS_T = \sum_i \sum_j X_{ij}^2 - \frac{X_{..}^2}{IJ} = 1195.17 - \frac{18144.09}{4 \times 4} = 61.1644$$

$$SS_C = \frac{\sum_j X_{.j}^2}{I} - \frac{X_{..}^2}{IJ} = \frac{4757.91}{4} - 1134.0056 = 55.4719$$

$$SS_R = \frac{\sum_i X_{i.}^2}{J} - \frac{X_{..}^2}{IJ} = \frac{4543.39}{4} - 1134.0056 = 1.8419$$

$$SS_E = SS_T - SS_C - SS_R = 3.8506$$


Table 1.21 Analysis of variance

Source of variation    f     SS         MS         F         $F_t$
Reflux ratio           3     55.4719    18.4906    43.222    $F_{3;9;0.95}$ = 3.86
Lab group              3     1.8419     0.6139     1.435     $F_{3;9;0.95}$ = 3.86
Error                  9     3.8506     0.4278     -         -
Total                  15    61.1644    -          -         -

When the calculated values of F are compared with $F_{3;9;0.95}$ = 3.86, it is seen at once that the number of required stages is significantly affected by the reflux ratio. No significant differences between lab groups were found.

Example  1.28  [12] 

Determine whether car tires of different producers have different wear-out rates after 40,000 km. Mark the tire types as A, B, C and D. The wear-out of the tires will be tested on four types of cars. As each car needs four tires, the experiment will test 16 tires, four of each type. In the experiment each car carried one tire of each type. The assignment of tires to wheels was completely random in order to average out the effect of differences between the wheels, if any exist.

Apart from determining whether the tire type has a significant influence on wear-out, decide how much the car type influences it. The obtained data are given in millimeters.

Table 1.22 Analysis of variance

                       Tire type
Car type               A       B       C       D       $X_{i.}$
I                      17      14      12      13      56
II                     14      14      12      11      51
III                    13      13      10      11      47
IV                     13      8       9       9       39
$X_{.j}$               57      49      43      44      193 = $X_{..}$
$\sum_i X_{ij}^2$      823     625     469     492     $\sum_i \sum_j X_{ij}^2 = 2409$

$$SS_T = \sum_i \sum_j X_{ij}^2 - \frac{X_{..}^2}{IJ} = 2409 - \frac{37249}{16} = 80.94$$

$$SS_C = \frac{\sum_j X_{.j}^2}{I} - \frac{X_{..}^2}{IJ} = \frac{9435}{4} - \frac{37249}{16} = 30.69$$

$$SS_R = \frac{\sum_i X_{i.}^2}{J} - \frac{X_{..}^2}{IJ} = \frac{9467}{4} - \frac{37249}{16} = 38.69$$

$$SS_E = SS_T - SS_C - SS_R = 11.56$$


Table 1.23 Analysis of variance

Source of variation    f     SS       MS       F        $F_t$
Tire type              3     30.69    10.23    7.99     $F_{3;9;0.95}$ = 3.86
Car type               3     38.69    12.90    10.08    $F_{3;9;0.95}$ = 3.86
Error                  9     11.56    1.28     -        -
Total                  15    80.94    -        -        -

The above results allow us to conclude with 95% confidence that both the tire type and the car type have a significant influence on tire wear-out. The same conclusion can be drawn even at the 99% confidence level, as $F_{3;9;0.99}$ = 6.99.


Model  for  two-way  analysis  of  variance 

For a two-way analysis of variance the assumed model is:

$$X_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij} \tag{1.128}$$

where:

$X_{ij}$ is assumed to come from a normal population with mean $\mu_{ij}$ and variance $\sigma^2$;

$$\mu = \mu_{..} = \frac{\sum_i \sum_j \mu_{ij}}{IJ}; \quad \alpha_i = \mu_{i.} - \mu_{..}; \quad \beta_j = \mu_{.j} - \mu_{..}; \quad \mu_{i.} = \frac{\sum_j \mu_{ij}}{J}; \quad \mu_{.j} = \frac{\sum_i \mu_{ij}}{I}$$

The parameter $\mu$ is the contribution of the grand mean, $\alpha_i$ is the contribution of the i-th level of the row variable, $\beta_j$ is the contribution of the j-th level of the column variable, and $\varepsilon_{ij}$ is the random experimental error. The model in Eq. (1.128) does not contain what is usually referred to as row-column interaction; that is, the row and column effects are additive. The restrictions (or assumptions) related to the model in Eq. (1.128) are:

$$\sum_i \alpha_i = 0 \quad \text{and} \quad \sum_j \beta_j = 0$$

These follow from:

$$\alpha_i = \mu_{i.} - \mu_{..} \quad \text{and} \quad \beta_j = \mu_{.j} - \mu_{..}$$

Two hypotheses related to the model are:

$H_0: \mu_{1.} = \mu_{2.} = ... = \mu_{I.}$ and
$H_0: \mu_{.1} = \mu_{.2} = ... = \mu_{.J}$

The first hypothesis says that there is no row effect, that is, the means across rows have the same value. Similarly, the second hypothesis says there is no column effect. The two hypotheses above can be written equivalently as:

$$H_0: \alpha_1 = \alpha_2 = ... = \alpha_I = 0 \tag{1.129}$$

$$H_0: \beta_1 = \beta_2 = ... = \beta_J = 0 \tag{1.130}$$

Furthermore, it can be shown (just as for the one-way model) that the expected mean squares are:

$$E(MS_R) = \sigma^2 + \frac{J}{I-1}\sum_i \alpha_i^2$$

$$E(MS_C) = \sigma^2 + \frac{I}{J-1}\sum_j \beta_j^2$$

$$E(MS_E) = \sigma^2$$

Thus, to test $H_0$: all $\alpha_i = 0$, we use the F-statistic with I-1 and (I-1)(J-1) degrees of freedom:

$$F = \frac{MS_R}{MS_E}$$

The rejection region is $F > F_{I-1;(I-1)(J-1);1-\alpha}$. Similarly, to test $H_0$: all $\beta_j = 0$, we use the F-statistic with J-1 and (I-1)(J-1) degrees of freedom:

$$F = \frac{MS_C}{MS_E}$$

The rejection region is $F > F_{J-1;(I-1)(J-1);1-\alpha}$.

Confidence  intervals  and  tests  of  hypotheses 

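In the two-way model for analysis of variance:

$$X_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$$

it may be of interest to compare the effects due to two rows, say $\alpha_1$ and $\alpha_2$, if $H_0$: all $\alpha_i = 0$ has been rejected. The mean for the i-th row is:

$$\bar{X}_{i.} = \mu + \alpha_i + \bar{\beta}_{.} + \bar{\varepsilon}_{i.}$$

where:

$$\bar{\beta}_{.} = \frac{\sum_j \beta_j}{J} = 0 \quad \text{and} \quad \bar{\varepsilon}_{i.} = \frac{\sum_j \varepsilon_{ij}}{J}$$

Hence:

$$\bar{X}_{1.} - \bar{X}_{2.} = \alpha_1 - \alpha_2 + \bar{\varepsilon}_{1.} - \bar{\varepsilon}_{2.} \quad \text{and} \quad E(\bar{X}_{1.} - \bar{X}_{2.}) = \alpha_1 - \alpha_2$$

since $E(\varepsilon_{ij}) = 0$ for all i and j.

Now $MS_E$ is an unbiased estimator for $\sigma^2$ and the random variable:

$$T = \frac{\bar{X}_{1.} - \bar{X}_{2.} - (\alpha_1 - \alpha_2)}{\sqrt{MS_E\left(\frac{1}{J} + \frac{1}{J}\right)}} = \frac{\bar{X}_{1.} - \bar{X}_{2.} - (\alpha_1 - \alpha_2)}{\sqrt{2\,MS_E / J}} \tag{1.131}$$

has a t-distribution with (I-1)(J-1) degrees of freedom.

Referring to Eq. (1.74), a (1-$\alpha$)100% confidence interval for $\alpha_1 - \alpha_2$ can be constructed:

$$\bar{X}_{1.} - \bar{X}_{2.} \pm t_{k;1-\alpha/2} \sqrt{\frac{2\,MS_E}{J}} \tag{1.132}$$

where:

k = (I-1)(J-1).

If these confidence limits cover the value zero then we accept $H_0: \alpha_1 = \alpha_2$, that is, there is no significant difference between the effects due to rows 1 and 2.

They can also be used for any pair of column effects if the difference of the sample means is replaced by $\bar{X}_{.1} - \bar{X}_{.2}$ and J is replaced with I.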

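Consider the random variable $\sum_i c_i \bar{X}_{i.}$, where $c_1, c_2, ..., c_I$ are any constants. It is not difficult to show that $\sum_i c_i \bar{X}_{i.}$ has mean $\sum_i c_i (\mu + \alpha_i)$ and variance $\sum_i c_i^2 \sigma^2 / J$.

Now if we assume $\sum_i c_i = 0$, then the mean becomes $\sum_i c_i \alpha_i$, since:

$$\sum_i c_i \mu = \mu \sum_i c_i = 0$$

The linear combination $\sum_i c_i \alpha_i$ is called a contrast if $\sum_i c_i = 0$.

Let $L = \sum_i c_i \bar{X}_{i.}$ be the estimate of a contrast. Then L has a normal distribution with mean $\sum_i c_i \alpha_i$ and variance $\sum_i c_i^2 \sigma^2 / J$. Thus the random variable:

$$T = \frac{\sum_i c_i \bar{X}_{i.} - \sum_i c_i \alpha_i}{\sqrt{\frac{MS_E}{J}\sum_i c_i^2}} \tag{1.133}$$

has a t-distribution with (I-1)(J-1) degrees of freedom. From this we obtain a (1-$\alpha$) × 100% confidence interval for the contrast $\sum_i c_i \alpha_i$:

$$\sum_i c_i \bar{X}_{i.} - t_{k;1-\alpha/2}\left(\frac{MS_E}{J}\sum_i c_i^2\right)^{0.5} < \sum_i c_i \alpha_i < \sum_i c_i \bar{X}_{i.} + t_{k;1-\alpha/2}\left(\frac{MS_E}{J}\sum_i c_i^2\right)^{0.5} \tag{1.134}$$

where:

k = (I-1)(J-1).

Similarly, for the contrast $\sum_j a_j \beta_j$:

$$\sum_j a_j \bar{X}_{.j} - t_{k;1-\alpha/2}\left(\frac{MS_E}{I}\sum_j a_j^2\right)^{0.5} < \sum_j a_j \beta_j < \sum_j a_j \bar{X}_{.j} + t_{k;1-\alpha/2}\left(\frac{MS_E}{I}\sum_j a_j^2\right)^{0.5} \tag{1.135}$$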
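The intervals (1.134) and (1.135) share one computational pattern: estimate ± t × (MS_E/n × Σc²)^0.5, where n is the number of observations averaged in each mean (J for row means, I for column means). A minimal sketch follows; the function name, argument names and numerical inputs are hypothetical, and the t-value must be supplied from a table.

```python
import math


def contrast_interval(coeffs, means, ms_error, n_per_mean, t_crit):
    """Confidence interval for a contrast of row (or column) effects.

    coeffs     -- contrast coefficients; they must sum to zero
    means      -- the corresponding sample means
    ms_error   -- error mean square MS_E
    n_per_mean -- observations per mean (J for rows, I for columns)
    t_crit     -- tabulated t value for (I-1)(J-1) degrees of freedom
    """
    if abs(sum(coeffs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    estimate = sum(c * m for c, m in zip(coeffs, means))
    half_width = t_crit * math.sqrt(ms_error / n_per_mean
                                    * sum(c * c for c in coeffs))
    return estimate - half_width, estimate + half_width


# Hypothetical case: I = 3 row means, each averaged over J = 4 observations,
# MS_E = 2.0 with k = (3-1)(4-1) = 6 degrees of freedom, t(6; 0.975) = 2.447.
lower, upper = contrast_interval([1, -1, 0], [10.0, 12.0, 17.0],
                                 ms_error=2.0, n_per_mean=4, t_crit=2.447)
print(lower, upper)
```

In this made-up case the interval for $\alpha_1 - \alpha_2$ is -2 ± 2.447, which covers zero, so rows 1 and 2 would not be judged different.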


Example  1.29 

Using the results of the analysis of variance in Example 1.27, determine the 95% confidence interval for the contrast $\beta_1 - \beta_2$.

For the contrast we need to use the interval given in Eq. (1.135). Since:

$a_1 = 1$, $a_2 = -1$; $MS_E$ = 0.4278; I = 4; k = (I-1)(J-1) = 9; $t_{9;0.975}$ = 2.262; $\bar{X}_{.1}$ = 43.9/4 = 10.975; $\bar{X}_{.2}$ = 35.5/4 = 8.875,

the 95% confidence interval is:

$$\bar{X}_{.1} - \bar{X}_{.2} \pm 2.262\left(\frac{0.4278}{4}(1+1)\right)^{0.5} = 10.975 - 8.875 \pm 1.046 = 2.100 \pm 1.046 \Rightarrow (1.054;\ 3.146)$$

Since the confidence interval does not contain zero we can say that $\beta_1 - \beta_2 > 0$, or $\beta_1 > \beta_2$; that is, the effect of the first reflux ratio is greater than that of the second reflux ratio. Confidence intervals can also be constructed for contrasts in the one-way model $X_{ij} = \mu + \alpha_j + \varepsilon_{ij}$. The confidence interval for the contrast:

$$\sum_j c_j \alpha_j = \sum_j c_j (\mu_j - \mu) = \sum_j c_j \mu_j$$

at the $\alpha$ level of significance is:

$$\sum_j c_j \bar{X}_{.j} \pm t_{J(I-1);1-\alpha/2}\left(\frac{MS_E}{I}\sum_j c_j^2\right)^{0.5} \tag{1.136}$$


Example  1.30 

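For the fluoride data of Example 1.24 determine a 95% confidence interval on $\alpha_A - \alpha_B$.
The contrast is:

$$\alpha_A - \alpha_B = (\mu_A - \mu) - (\mu_B - \mu) = \mu_A - \mu_B$$

From the results of Example 1.24 we have: $\bar{X}_{.A}$ = 4.0; $\bar{X}_{.B}$ = 3.6; J(I-1) = 4(5-1) = 16; $MS_E$ = 1.45; $c_A$ = 1; $c_B$ = -1 and $t_{16;0.975}$ = 2.120. The standard error of the difference is $(1.45/5 \times (1+1))^{0.5}$ = 0.7616, which is $\sqrt{2}$ times the standard error of a single mean found in Example 1.24. The 95% confidence limits on $\mu_A - \mu_B$ are, from Eq. (1.136):

$$4.0 - 3.6 \pm 2.120\left(\frac{1.45}{5}(1+1)\right)^{0.5} = 0.4 \pm 2.120 \times 0.7616 \Rightarrow (-1.215;\ 2.015)$$

Since the interval contains zero we conclude that there is no significant difference between methods A and B.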


Interaction 

Two-way analysis of variance (and higher classifications) raises the possibility of interactions between factors. If, for example, an additive A is added to a lube oil stock to improve its resistance to oxidation and another additive, B, is added to inhibit corrosion by the stock under load or stress, it is entirely possible that the performance of the lube oil in a standard ball-and-socket wear test will be different from that expected if only one additive were present. In other words, the presence of one additive may adversely or helpfully affect the action of the other additive in modifying the properties of the lube oil. These are termed antagonistic and synergistic effects, respectively. The same phenomenon is clearly evident in a composite rocket propellant, where the effect of the catalyst on the burning rate of the propellant depends strongly on the presence of fine oxidizer particles. It is important to consider the presence of such interactions in any treatment of multiply classified data. To do this, the two-way analysis of variance table is set up as shown in Table 1.24.

In two-way analysis of variance with no replications, the interaction of factors is part of the experimental error or, more precisely, of the residual variance. To separate the interaction from the residual variance (the experimental error variance), it is necessary to replicate each design point, i.e., every combination of rows and columns, K times.


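Table 1.24 Two-way analysis of variance with interactions

Source of variation         Degrees of freedom    Sum of squares - definition                                                                            Sum of squares - practical calculation                                                                                     Mean squares                         Test statistic
Between columns             J-1                   $SS_C = IK \sum_j (\bar{X}_{.j.} - \bar{X}_{...})^2$                                                  $SS_C = \sum_j X_{.j.}^2/(IK) - X_{...}^2/(IJK)$                                                                          $MS_C = SS_C/(J-1)$                  $F = MS_C/MS_E$
Between rows                I-1                   $SS_R = JK \sum_i (\bar{X}_{i..} - \bar{X}_{...})^2$                                                  $SS_R = \sum_i X_{i..}^2/(JK) - X_{...}^2/(IJK)$                                                                          $MS_R = SS_R/(I-1)$                  $F = MS_R/MS_E$
Interaction columns-rows    (J-1)(I-1)            $SS_{CR} = K \sum_i \sum_j (\bar{X}_{ij.} - \bar{X}_{i..} - \bar{X}_{.j.} + \bar{X}_{...})^2$         $SS_{CR} = \sum_i \sum_j X_{ij.}^2/K - \sum_i X_{i..}^2/(JK) - \sum_j X_{.j.}^2/(IK) + X_{...}^2/(IJK)$                  $MS_{CR} = SS_{CR}/[(I-1)(J-1)]$     $F = MS_{CR}/MS_E$
Error                       IJ(K-1)               $SS_E = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{ij.})^2$                                             $SS_E = SS_T - SS_C - SS_R - SS_{CR}$                                                                                     $MS_E = SS_E/[IJ(K-1)]$              -
Total                       IJK-1                 $SS_T = \sum_i \sum_j \sum_k (X_{ijk} - \bar{X}_{...})^2$                                             $SS_T = \sum_i \sum_j \sum_k X_{ijk}^2 - X_{...}^2/(IJK)$                                                                 -                                    -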
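The practical-calculation column of Table 1.24 translates directly into code. The sketch below is illustrative only: the function name and result keys are assumptions, and the data are the yield values that appear later in Example 1.31 (rows are the two times, columns the three temperatures, K = 2 replicates per cell, with the 2 h / 110 °C cell read as 10 and 8).

```python
def two_way_anova_replicated(cells):
    """Two-way ANOVA with interaction; cells[i][j] is the list of K replicates."""
    I = len(cells)            # rows
    J = len(cells[0])         # columns
    K = len(cells[0][0])      # replicates per cell
    cell_totals = [[sum(cells[i][j]) for j in range(J)] for i in range(I)]   # X_ij.
    row_totals = [sum(row) for row in cell_totals]                           # X_i..
    col_totals = [sum(cell_totals[i][j] for i in range(I)) for j in range(J)]  # X_.j.
    correction = sum(row_totals) ** 2 / (I * J * K)                          # X_...^2 / IJK

    ss_t = sum(x * x for row in cells for cell in row for x in cell) - correction
    ss_c = sum(t * t for t in col_totals) / (I * K) - correction
    ss_r = sum(t * t for t in row_totals) / (J * K) - correction
    ss_cells = sum(t * t for row in cell_totals for t in row) / K - correction
    ss_cr = ss_cells - ss_c - ss_r            # interaction
    ss_e = ss_t - ss_c - ss_r - ss_cr         # error
    ms_e = ss_e / (I * J * (K - 1))
    return {"SST": ss_t, "SSC": ss_c, "SSR": ss_r, "SSCR": ss_cr, "SSE": ss_e,
            "MSE": ms_e,
            "F_columns": (ss_c / (J - 1)) / ms_e,
            "F_rows": (ss_r / (I - 1)) / ms_e,
            "F_interaction": (ss_cr / ((I - 1) * (J - 1))) / ms_e}


# Percentage yields: rows = time (1 h, 2 h), columns = temperature (110, 115, 120 °C).
yields = [[[5, 6], [9, 7], [10, 11]],
          [[10, 8], [11, 12], [13, 15]]]
anova = two_way_anova_replicated(yields)
print(anova)
```

For these data the interaction sum of squares is zero, so the whole residual variation (7.5 with 6 degrees of freedom) is experimental error.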


Three-way  analysis  of  variance 

Based on the models and assumptions of one-way and two-way analyses of variance, with or without replication of design points, it is possible to generalize to multiple-way analysis of variance. It is of interest to present the three-way analysis of variance, for it is used quite often. In the case of a three-way analysis of variance the total number of observations is N = I×J×K×L, where I, J and K are the numbers of levels of the three factors, arranged as columns, rows and layers, and L is the number of design-point replications, i.e., the number of observations per cell. Fig. 1.18 shows the three-dimensional arrangement of columns, rows and layers.


Figure  1.18  Arrangement  of  data  in  columns,  rows  and  layers  to  represent  three-way  ANOVA 




In the calculations for three-way analysis without replication (L=1), the same notation conventions are followed as for one-way and two-way analysis of variance:

$$X_{...}=\sum_i\sum_j\sum_k x_{ijk};\qquad \bar{x}_{...}=\frac{X_{...}}{IJK} \tag{1.137}$$

$$X_{...}^2=\Big(\sum_i\sum_j\sum_k x_{ijk}\Big)^2 \tag{1.138}$$


The definitions and calculation procedures for three-way analysis of variance without replication of the design points (L=1) are given in Table 1.26. For the same analysis with replication (L>1), the calculation is given in Table 1.27. The following notation conventions apply:

$$X_{....}=\sum_i\sum_j\sum_k\sum_l x_{ijkl};\qquad \bar{x}_{....}=\frac{X_{....}}{IJKL};\qquad X_{i...}=\sum_j\sum_k\sum_l x_{ijkl} \tag{1.139}$$


Example 1.31 [13]

The following results were obtained in a two-factor experiment with two replications, studying the effect of temperature (three levels) and time (two levels) on the percentage yield of a reaction:

Time                              Temperature [°C]                 $X_{i..}$
                                  110        115        120
1 h                               5; 6       9; 7       10; 11     48.0
2 h                               10; 8      11; 12     13; 15     69.0
$X_{.j.}$                         29.0       39.0       49.0       $X_{...}=117.0$
$\sum_i\sum_k x_{ijk}^2$          225.0      395.0      615.0      $\sum_i\sum_j\sum_k x_{ijk}^2=1235.0$

Table 1.25 Analysis of variance

Source of variation    f     SS       MS       F        Ft
Temperature            2     50.00    25.00    20.00    $F_{2;6;0.95}=5.14$
Time                   1     36.75    36.75    29.40    $F_{1;6;0.95}=5.99$
Interaction            2     0.00     0.00     0.00     $F_{2;6;0.95}=5.14$
Error                  6     7.50     1.25     -        -
Total                  11    94.25    -        -        -

$$SS_T=\sum_i\sum_j\sum_k x_{ijk}^2-\frac{X_{...}^2}{IJK}=1235.0-\frac{117.0^2}{2\times3\times2}=94.25$$

$$SS_C=\frac{\sum_j X_{.j.}^2}{IK}-\frac{X_{...}^2}{IJK}=\frac{29^2+39^2+49^2}{2\times2}-\frac{117.0^2}{2\times3\times2}=50.0$$

$$SS_R=\frac{\sum_i X_{i..}^2}{JK}-\frac{X_{...}^2}{IJK}=\frac{48^2+69^2}{3\times2}-\frac{117.0^2}{2\times3\times2}=36.75$$

$$SS_{CR}=\frac{\sum_i\sum_j X_{ij.}^2}{K}-\frac{\sum_i X_{i..}^2}{JK}-\frac{\sum_j X_{.j.}^2}{IK}+\frac{X_{...}^2}{IJK}=\frac{11^2+18^2+\dots+21^2+28^2}{2}-\frac{48^2+69^2}{3\times2}-\frac{29^2+39^2+49^2}{2\times2}+\frac{117.0^2}{2\times3\times2}=0.0$$

$$SS_E=SS_T-SS_C-SS_R-SS_{CR}=94.25-50.0-36.75-0.0=7.50$$

Table 1.27 Three-way analysis of variance: with replications

[Table body lost in extraction. Surviving column headings: Source of variation; Degrees of freedom; Sum of squares (definition); Sum of squares (calculation form); Mean square; Test statistic.]

The results clearly show that temperature and time have a significant effect on reaction yield at both the 95% and 99% confidence levels. It is also clear that there is no interaction between the temperature and time factors. The experimental error is small relative to the main effects, so the measurement equipment used can be considered satisfactory.
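The calculation-form sums of squares used in this example can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the function name `two_way_anova` and the nested-list data layout (`cells[i][j]` holds the K replicates for row i, column j) are choices made here, not part of the original text.

```python
def two_way_anova(cells):
    """Two-way ANOVA with K replicates per cell.

    cells[i][j] is the list of K observations for row i and column j
    (hypothetical layout chosen for this sketch).
    """
    I, J, K = len(cells), len(cells[0]), len(cells[0][0])
    values = [x for row in cells for cell in row for x in cell]
    correction = sum(values) ** 2 / (I * J * K)          # X...^2 / IJK

    row_totals = [sum(sum(cell) for cell in row) for row in cells]             # X_i..
    col_totals = [sum(sum(cells[i][j]) for i in range(I)) for j in range(J)]   # X_.j.

    ss_total = sum(x * x for x in values) - correction
    ss_rows = sum(t * t for t in row_totals) / (J * K) - correction
    ss_cols = sum(t * t for t in col_totals) / (I * K) - correction
    ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / K - correction
    ss_inter = ss_cells - ss_rows - ss_cols
    ss_error = ss_total - ss_cells

    ss = {"columns": ss_cols, "rows": ss_rows, "interaction": ss_inter,
          "error": ss_error, "total": ss_total}
    df = {"columns": J - 1, "rows": I - 1, "interaction": (I - 1) * (J - 1),
          "error": I * J * (K - 1), "total": I * J * K - 1}
    ms_error = ss_error / df["error"]
    f = {name: (ss[name] / df[name]) / ms_error
         for name in ("columns", "rows", "interaction")}
    return ss, df, f


# Data of Example 1.31: rows = time (1 h, 2 h), columns = temperature (110, 115, 120 °C)
yield_data = [
    [[5, 6], [9, 7], [10, 11]],
    [[10, 8], [11, 12], [13, 15]],
]
ss, df, f = two_way_anova(yield_data)
for name in ("columns", "rows", "interaction", "error", "total"):
    print(name, df[name], round(ss[name], 2))
```

The computed F-ratios in `f` are then compared with the tabulated values, exactly as in Table 1.25.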

Example 1.32 [12]

The influence of glass types and coatings on specific conductivity was tested in developing cathode tubes for TV sets. From each obtained value 260 mA were subtracted, so that the coded values are:

Type of glass                     Type of coating                              $X_{i..}$
                                  A             B             C
1                                 4; 6; 5       8; 10; 7      2; 5; 6          53.0
2                                 -6; -5; -4    0; -4; -5     -8; -7; -6       -45.0
$X_{.j.}$                         0.0           16.0          -8.0             $X_{...}=8.0$
$\sum_i\sum_k x_{ijk}^2$          154.0         254.0         214.0            $\sum_i\sum_j\sum_k x_{ijk}^2=622.0$

I=2; J=3; K=3


$$SS_T=622.0-\frac{8.0^2}{2\times3\times3}=618.44$$

$$SS_C=\frac{0.0^2+16.0^2+(-8)^2}{2\times3}-\frac{8.0^2}{2\times3\times3}=53.33-3.56=49.77$$

$$SS_R=\frac{53^2+(-45)^2}{3\times3}-\frac{8.0^2}{2\times3\times3}=537.11-3.56=533.55$$

$$SS_{CR}=\frac{15^2+(-15)^2+\dots+13^2+(-21)^2}{3}-537.11-53.33+3.56=1.79$$

$$SS_E=618.44-49.77-533.55-1.79=33.33$$


Table 1.28 Analysis of variance

Source of variation    f     SS        MS        F         Ft
Type of coating        2     49.77     24.88     8.95      $F_{2;12;0.95}=3.89$
Type of glass          1     533.55    533.55    191.92    $F_{1;12;0.95}=4.75$
Interaction            2     1.79      0.89      0.32      $F_{2;12;0.95}=3.89$
Error                  12    33.33     2.78      -         -
Total                  17    618.44    -         -         -

The analysis of variance clearly shows that glass type has the dominant influence on cathode tube conductivity. The effect of coating type is also significant, while the interaction is not significant.

Example  1.33  [10] 

A group of 24 mice was randomly divided into six subgroups of four, and each mouse was given an insulin shot. Three insulin dose levels and two procedures of preparing insulin, A and B, were used, and the percentage reduction of blood sugar in the mice was measured some time after the injection. The results are shown in the following table:

Table 1.29 Analysis of variance

Preparation                       Doses                            $X_{i..}$
                                  2.29       3.63       5.57
A                                 17         64         62
                                  21         48         72
                                  49         34         61
                                  54         63         91         636
B                                 33         41         56
                                  37         64         62
                                  40         34         57
                                  16         64         72         576
$X_{.j.}$                         267        412        533        $X_{...}=1212$
$\sum_i\sum_k x_{ijk}^2$          10361      22554      36443      69358

$SS_T=69358-61206=8152$; $SS_C=65640.25-61206=4434.25$
$SS_R=61356-61206=150.00$; $SS_{CR}=65863-65640.25-61356+61206=72.75$
$SS_E=8152-4434.25-150-72.75=3495$

Table 1.30 Analysis of variance

Source of variation    f     SS         MS          F         Ft
Doses                  2     4434.25    2217.125    11.419    $F_{2;18;0.95}=3.55$
Preparation            1     150.00     150.000     0.773     $F_{1;18;0.95}=4.41$
Interaction            2     72.75      36.375      0.187     $F_{2;18;0.95}=3.55$
Error                  18    3495.00    194.167     -         -
Total                  23    8152.00    -           -         -

Thus, we can say with 95% confidence that the reduction of the blood sugar percentage in mice is significantly influenced only by the size of the insulin dose.
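The step from sums of squares to the conclusion above (mean squares, F-ratios, comparison with tabulated values) can be sketched as follows. The helper name `f_tests` and the dictionary layout are assumptions made for this illustration; the numbers are those of Table 1.30, and the critical values are the tabulated ones quoted there rather than computed.

```python
def f_tests(ss, df, critical, error="Error"):
    """Return {source: (F, significant)} using the error mean square as denominator."""
    ms_error = ss[error] / df[error]
    report = {}
    for source in ss:
        if source == error:
            continue
        f_value = (ss[source] / df[source]) / ms_error
        # An effect is called significant when F exceeds the tabulated value
        report[source] = (round(f_value, 3), f_value > critical[source])
    return report


# Sums of squares and degrees of freedom from Table 1.30
ss = {"Doses": 4434.25, "Preparation": 150.00, "Interaction": 72.75, "Error": 3495.00}
df = {"Doses": 2, "Preparation": 1, "Interaction": 2, "Error": 18}
# Tabulated 95% critical values quoted in Table 1.30
critical = {"Doses": 3.55, "Preparation": 4.41, "Interaction": 3.55}

report = f_tests(ss, df, critical)
for source, (f_value, significant) in report.items():
    print(source, f_value, "significant" if significant else "not significant")
```

Only the dose effect exceeds its tabulated value, which matches the conclusion drawn from the table.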

Example  1.34  [15] 

The following results were obtained in a three-way, two-level experiment in which the effects of temperature A, pressure B and catalyst C on chemical reaction yield were analyzed:

Table 1.31 Analysis of variance

                        A1                  A2                  $X_{i..}$
                        C1        C2        C1        C2
B1                      58        64        71        78        271
B2                      53        63        69        81        266
$X_{.jk}$               111       127       140       159       $X_{...}=537$
$\sum_i x_{ijk}^2$      6173      8065      9802      12645     $\sum_i\sum_j\sum_k x_{ijk}^2=36685$

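$X_{1..}=58+64+71+78=271$; $X_{2..}=53+63+69+81=266$; $X_{...}=537$; $X_{...}^2=288369$

$\sum_i X_{i..}^2=271^2+266^2=144197$; $\sum_i\sum_j\sum_k x_{ijk}^2=36685$

$\sum_j X_{.j.}^2=(58+64+53+63)^2+(71+78+69+81)^2=146045$

$\sum_k X_{..k}^2=(58+71+53+69)^2+(64+63+78+81)^2=144797$

$\sum_i\sum_j X_{ij.}^2=(58+64)^2+(71+78)^2+(53+63)^2+(69+81)^2=73041$

$\sum_i\sum_k X_{i.k}^2=(58+71)^2+(64+78)^2+(53+69)^2+(63+81)^2=72425$

$\sum_j\sum_k X_{.jk}^2=(58+53)^2+(64+63)^2+(71+69)^2+(78+81)^2=73331$

$$SS_C=\frac{\sum_j X_{.j.}^2}{IK}-\frac{X_{...}^2}{IJK}=\frac{146045}{2\times2}-\frac{288369}{2\times2\times2}=465.1$$

$$SS_R=\frac{\sum_i X_{i..}^2}{JK}-\frac{X_{...}^2}{IJK}=\frac{144197}{2\times2}-\frac{288369}{2\times2\times2}=3.1$$

$$SS_L=\frac{\sum_k X_{..k}^2}{IJ}-\frac{X_{...}^2}{IJK}=\frac{144797}{2\times2}-\frac{288369}{2\times2\times2}=153.1$$

$$SS_{CR}=\frac{\sum_i\sum_j X_{ij.}^2}{K}-\frac{\sum_i X_{i..}^2}{JK}-\frac{\sum_j X_{.j.}^2}{IK}+\frac{X_{...}^2}{IJK}=\frac{73041}{2}-\frac{144197}{2\times2}-\frac{146045}{2\times2}+\frac{288369}{2\times2\times2}=6.1$$

$$SS_{CL}=\frac{\sum_j\sum_k X_{.jk}^2}{I}-\frac{\sum_j X_{.j.}^2}{IK}-\frac{\sum_k X_{..k}^2}{IJ}+\frac{X_{...}^2}{IJK}=\frac{73331}{2}-\frac{146045}{2\times2}-\frac{144797}{2\times2}+\frac{288369}{2\times2\times2}=1.1$$

$$SS_{RL}=\frac{\sum_i\sum_k X_{i.k}^2}{J}-\frac{\sum_i X_{i..}^2}{JK}-\frac{\sum_k X_{..k}^2}{IJ}+\frac{X_{...}^2}{IJK}=\frac{72425}{2}-\frac{144197}{2\times2}-\frac{144797}{2\times2}+\frac{288369}{2\times2\times2}=10.1$$

$$SS_{CRL}=\sum_i\sum_j\sum_k x_{ijk}^2-\frac{\sum_i\sum_j X_{ij.}^2}{K}-\frac{\sum_j\sum_k X_{.jk}^2}{I}-\frac{\sum_i\sum_k X_{i.k}^2}{J}+\frac{\sum_i X_{i..}^2}{JK}+\frac{\sum_j X_{.j.}^2}{IK}+\frac{\sum_k X_{..k}^2}{IJ}-\frac{X_{...}^2}{IJK}$$

$$=36685-\frac{73041}{2}-\frac{73331}{2}-\frac{72425}{2}+\frac{144197}{2\times2}+\frac{146045}{2\times2}+\frac{144797}{2\times2}-\frac{288369}{2\times2\times2}=0.125$$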


This is an example of three-way analysis of variance with no design-point replication. As we have only one value for each combination of factor levels, the variance (mean square) within the cell cannot be calculated as an estimate of the system variance. In the absence of an error variance, or rather a reproducibility variance, an interaction of higher order can be used as the error estimate for the F-test. Although not all statisticians agree with this approach, the three-way interaction variance C x R x L was taken here as the error estimate for the F-test. The tabular results show that only the effects of columns and layers, i.e., temperature and catalyst, are significant. Pressure and the interactions are not significant at the 95% confidence level.

Table 1.32 Analysis of variance

Source of variation        f    SS       MS       F       Ft
Between columns            1    465.1    465.1    3720    $F_{1;1;0.95}=161$
Between rows               1    3.1      3.1      24.8    $F_{1;1;0.95}=161$
Between layers             1    153.1    153.1    1223    $F_{1;1;0.95}=161$
Interaction C x R          1    6.1      6.1      48.8    $F_{1;1;0.95}=161$
Interaction C x L          1    1.1      1.1      8.0     $F_{1;1;0.95}=161$
Interaction R x L          1    10.1     10.1     80      $F_{1;1;0.95}=161$
Interaction C x R x L      1    0.125    0.125    -       -

The other approach to estimating the reproducibility variance is to pool several interactions. We can assume that all four interactions are unimportant, i.e., we can pool their degrees of freedom and sums of squares to obtain the estimate of the error variance. In that case the number of degrees of freedom is 4 and the sum of squares is 6.1+1.1+10.1+0.125=17.425, so that $MS_E=4.36$. With $F_{1;4;0.95}=7.71$, the mean square of any effect must exceed 4.36 x 7.71 = 33.6 for the effect to be statistically significant at the 95% confidence level. This again shows that in this case only the effects of temperature and catalyst are statistically significant.
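The unreplicated three-way decomposition and the pooled-interaction error estimate can be sketched as below. This is an illustrative sketch only: the function names, the nested-list layout `x[i][j][k]` (i = pressure B as rows, j = temperature A as columns, k = catalyst C as layers) and the helper `margin` are assumptions introduced here. Because the code keeps unrounded sums of squares, the pooled mean square comes out slightly different from the hand value based on rounded figures.

```python
from itertools import product


def three_way_ss(x):
    """Sums of squares for a three-way layout with one observation per cell.

    x[i][j][k]: i indexes rows, j columns, k layers (layout assumed for this sketch).
    """
    I, J, K = len(x), len(x[0]), len(x[0][0])
    cells = list(product(range(I), range(J), range(K)))
    n = I * J * K

    def margin(keep):
        # Sum of squared totals over the kept indices, each divided by
        # the number of observations contributing to one total.
        totals = {}
        for idx in cells:
            key = tuple(idx[a] for a in keep)
            totals[key] = totals.get(key, 0.0) + x[idx[0]][idx[1]][idx[2]]
        return sum(t * t for t in totals.values()) / (n / len(totals))

    cf = margin(())                                    # X...^2 / IJK
    r, c, l = margin((0,)), margin((1,)), margin((2,))
    rc, rl, cl = margin((0, 1)), margin((0, 2)), margin((1, 2))
    raw = margin((0, 1, 2))                            # sum of x_ijk^2
    return {
        "C": c - cf, "R": r - cf, "L": l - cf,
        "CxR": rc - r - c + cf,
        "CxL": cl - c - l + cf,
        "RxL": rl - r - l + cf,
        "CxRxL": raw - rc - rl - cl + r + c + l - cf,
    }


# Data of Example 1.34: x[B][A][C]
x = [
    [[58, 64], [71, 78]],   # B1: (A1: C1, C2), (A2: C1, C2)
    [[53, 63], [69, 81]],   # B2
]
ss = three_way_ss(x)

# Pool all four interactions (1 degree of freedom each) as the error estimate
pooled = ("CxR", "CxL", "RxL", "CxRxL")
ms_error = sum(ss[name] for name in pooled) / len(pooled)
f_critical = 7.71                                      # tabulated F(1;4;0.95) from the text
significant = [name for name in ("C", "R", "L") if ss[name] / ms_error > f_critical]
print(ss, ms_error, significant)
```

With the unrounded values the pooled error mean square is 4.375, and the conclusion is unchanged: only columns (temperature) and layers (catalyst) pass the test.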

Example  1.35  [14] 

Transfer  of  mass  from  liquid  into  solid  particles  of  an  expanded  fountain-fluidized 
bed  is  developed  in  the  annulus  and  fountain  according  to  the  following  criterion 
equation 

Sh  = A x Re  a x Res  x 
where:

Sh is the Sherwood criterion;

ReA and Res are the Reynolds number criteria for the annulus and the fountain, and
H/D is the height simplex: the height of the added active coal bed relative to the column diameter.

By varying ReA, Res and H/D, we obtained the values of the Sherwood criterion shown in the table. By applying analysis of variance, verify the given criterion equation by establishing the effects of the observed factors on the basis of the experimental values.

Table 1.33 Analysis of variance

                                  ReS1                    ReS2                    $X_{i...}$
                                  ReA1        ReA2        ReA1        ReA2
(H0/D)1                           2.023       2.235       2.110       2.123
                                  2.024       2.212       2.112       2.119       16.958
                                  [4.047]     [4.447]     [4.222]     [4.242]
(H0/D)2                           1.790       1.897       1.966       1.983
                                  1.782       1.868       1.988       1.991       15.265
                                  [3.572]     [3.765]     [3.954]     [3.974]
$X_{.jk.}$                        7.619       8.212       8.176       8.216       $X_{....}=32.223$
$\sum_i\sum_l x_{ijkl}^2$         14.569      16.976      16.730      16.894      65.169

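I=2; J=2; K=2; L=2; N=IJKL=16;

$\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2=65.169$; $X_{....}=32.223$; $X_{....}^2=1038.322$;

$\sum_i\sum_j\sum_k X_{ijk.}^2=130.335$; $\sum_i X_{i...}^2=16.958^2+15.265^2=520.594$;

$\sum_j X_{.j..}^2=(7.619+8.212)^2+(8.176+8.216)^2=519.319$

$\sum_k X_{..k.}^2=(7.619+8.176)^2+(8.212+8.216)^2=519.361$

$\sum_i\sum_j X_{ij..}^2=(4.047+4.447)^2+(4.222+4.242)^2+(3.572+3.765)^2+(3.954+3.974)^2=260.472$

$\sum_i\sum_k X_{i.k.}^2=(4.047+4.222)^2+(3.572+3.954)^2+(4.447+4.242)^2+(3.765+3.974)^2=260.408$

$\sum_j\sum_k X_{.jk.}^2=(4.047+3.572)^2+(4.447+3.765)^2+(4.222+3.954)^2+(4.242+3.974)^2=259.836$

$$SS_T=\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2-\frac{X_{....}^2}{N}=65.169-\frac{1038.322}{16}=0.274$$

$$SS_L=\frac{\sum_k X_{..k.}^2}{IJL}-\frac{X_{....}^2}{N}=\frac{519.361}{2\times2\times2}-64.895=0.025$$

$$SS_C=\frac{\sum_j X_{.j..}^2}{IKL}-\frac{X_{....}^2}{N}=\frac{519.319}{2\times2\times2}-\frac{32.223^2}{16}=0.020$$

$$SS_R=\frac{\sum_i X_{i...}^2}{JKL}-\frac{X_{....}^2}{N}=\frac{520.594}{2\times2\times2}-64.895=0.179$$

$$SS_{CR}=\frac{\sum_i\sum_j X_{ij..}^2}{KL}-\frac{\sum_i X_{i...}^2}{JKL}-\frac{\sum_j X_{.j..}^2}{IKL}+\frac{X_{....}^2}{N}=\frac{260.472}{2\times2}-\frac{520.594}{2\times2\times2}-\frac{519.319}{2\times2\times2}+64.895=0.024$$

$$SS_{CL}=\frac{\sum_j\sum_k X_{.jk.}^2}{IL}-\frac{\sum_j X_{.j..}^2}{IKL}-\frac{\sum_k X_{..k.}^2}{IJL}+\frac{X_{....}^2}{N}=\frac{259.836}{2\times2}-\frac{519.319}{2\times2\times2}-\frac{519.361}{2\times2\times2}+64.895=0.019$$

$$SS_{RL}=\frac{\sum_i\sum_k X_{i.k.}^2}{JL}-\frac{\sum_i X_{i...}^2}{JKL}-\frac{\sum_k X_{..k.}^2}{IJL}+\frac{X_{....}^2}{N}=\frac{260.408}{2\times2}-\frac{520.594}{2\times2\times2}-\frac{519.361}{2\times2\times2}+64.895=0.003$$

$$SS_E=\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2-\frac{\sum_i\sum_j\sum_k X_{ijk.}^2}{L}=65.169-\frac{130.335}{2}=0.001$$

$$SS_{CRL}=SS_T-SS_C-SS_R-SS_L-SS_{CR}-SS_{CL}-SS_{RL}-SS_E$$

$$=0.274-0.020-0.179-0.025-0.024-0.019-0.003-0.001=0.003$$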


Table 1.34 Analysis of variance

Source of variation      f     SS       MS         F          Ft
Res of fountain          1     0.020    0.020      166.67     $F_{1;8;0.95}=5.32$
H0/D simplex             1     0.179    0.179      1491.67    $F_{1;8;0.95}=5.32$
ReA annulus              1     0.025    0.025      208.33     $F_{1;8;0.95}=5.32$
Res x H0/D               1     0.024    0.024      200.00     $F_{1;8;0.95}=5.32$
Res x ReA                1     0.019    0.019      158.33     $F_{1;8;0.95}=5.32$
H0/D x ReA               1     0.003    0.003      25.000     $F_{1;8;0.95}=5.32$
Res x H0/D x ReA         1     0.003    0.003      25.000     $F_{1;8;0.95}=5.32$
Error                    8     0.001    0.00012    -          -
Total                    15    0.274    -          -          -

It is of interest for the methodology of analysis of variance that complex quantities such as the nondimensional Reynolds numbers and a simplex were used as factors in this example. Analysis of variance has shown that all factors and interactions are significant, which verifies the assumed criterion equation. The low experimental error indicates good reproducibility of the experiment and, at the same time, that all essential factors have been taken into consideration.

Example 1.36 [4]

The tables below give the results of an experiment measuring the niacin content of peeled and processed peas of three granulations (A, B and C), after three kinds of preparation (R1, R2 and R3).


Table 1.35 Peeled peas

        A                       B                       C
        R1      R2      R3      R1      R2      R3      R1      R2      R3
        65      90      44      59      70      83      88      80      123
        87      94      92      63      65      95      60      81      95
        48      86      88      81      78      88      96      105     100
        28      70      75      80      85      99      87      130     131
        20      78      80      76      74      81      68      122     121
        22      65      70      85      73      98      75      130     115
        24      75      73      64      61      95      96      121     127
        47      98      88      96      71      95      98      125     99
        28      95      77      91      63      90      84      172     101
        42      66      72      65      54      76      82      133     111
Sum     411     817     759     760     694     900     834     1199    1123

Table 1.36 Processed peas

        A                       B                       C
        R1      R2      R3      R1      R2      R3      R1      R2      R3
        62      106     126     150     138     150     146     52      100
        113     107     193     112     120     112     172     97      133
        171     79      122     136     135     126     138     112     125
        135     122     115     120     126     123     124     116     124
        123     125     126     118     120     125     113     121     115
        132     96      110     134     132     110     121     99      122
        120     111     98      125     135     125     125     120     112
        117     116     115     114     124     120     116     121     116
        153     124     112     112     137     110     165     122     99
        132     126     109     120     125     125     137     134     105
Sum     1258    1112    1226    1241    1292    1226    1357    1094    1151

These tables give:

I=2; J=3; K=3; L=10; N = I x J x K x L = 180; $X_{....}=18454$; $X_{....}^2=340\,550\,116$

$\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2=65^2+87^2+48^2+\dots+116^2+99^2+105^2=2\,054\,828$

Table 1.37 Peeled peas

        A       B       C
R1      411     760     834
R2      817     694     1199
R3      759     900     1123
Sum     1987    2354    3156

Table 1.38 Processed peas

        A       B       C
R1      1258    1241    1357
R2      1112    1292    1094
R3      1226    1226    1151
Sum     3596    3759    3602

By compiling the given sums, we can make up the following three tables:

Table 1.39 Accessory table

        Peeled    Processed    Sum
A       1987      3596         5583
B       2354      3759         6113
C       3156      3602         6758
Sum     7497      10957

Table 1.40 Accessory table

        Peeled    Processed
R1      2005      3856
R2      2710      3498
R3      2782      3603

Table 1.41 Accessory table

        A       B       C       Sum
R1      1669    2001    2191    5861
R2      1929    1986    2293    6208
R3      1985    2126    2274    6385

The following arithmetic is necessary for analysis of variance:

$\sum_i X_{i...}^2=7497^2+10957^2=176\,260\,858$

$\sum_j X_{.j..}^2=5583^2+6113^2+6758^2=114\,209\,222$

$\sum_k X_{..k.}^2=5861^2+6208^2+6385^2=113\,658\,810$

$\sum_i\sum_j X_{ij..}^2=1987^2+2354^2+3156^2+3596^2+3759^2+3602^2=59\,485\,522$

$\sum_i\sum_k X_{i.k.}^2=2005^2+2710^2+2782^2+3856^2+3498^2+3603^2=59\,189\,998$

$\sum_j\sum_k X_{.jk.}^2=1669^2+1929^2+1985^2+2001^2+1986^2+2126^2+2191^2+2293^2+2274^2=38\,144\,306$

$\sum_i\sum_j\sum_k X_{ijk.}^2=411^2+760^2+834^2+817^2+694^2+1199^2+759^2+900^2+1123^2+1258^2+1241^2+1357^2+1112^2+1292^2+1094^2+1226^2+1226^2+1151^2=20\,073\,704$

$SS_T=2\,054\,828-1\,891\,945=162\,883$; $SS_C=1\,958\,454-1\,891\,945=66\,509$
$SS_R=1\,903\,487-1\,891\,945=11\,542$; $SS_L=1\,894\,314-1\,891\,945=2\,369$
$SS_{CR}=1\,982\,851-1\,958\,454-1\,903\,487+1\,891\,945=12\,855$
$SS_{CL}=1\,973\,000-1\,958\,454-1\,894\,314+1\,891\,945=12\,177$
$SS_{RL}=1\,907\,215-1\,903\,487-1\,894\,314+1\,891\,945=1\,359$; $SS_E=2\,054\,828-2\,007\,370=47\,458$
$SS_{CRL}=2\,007\,370-1\,982\,851-1\,973\,000-1\,907\,215+1\,958\,454+1\,903\,487+1\,894\,314-1\,891\,945=8\,614$

Table 1.42 Analysis of variance

Source of variation        f      SS        MS          F         Ft
Between columns            1      66509     66509.00    227.03    $F_{1;162;0.95}=3.84$
Between rows               2      11542     5771.00     19.70     $F_{2;162;0.95}=3.00$
Between layers             2      2369      1184.50     4.04      $F_{2;162;0.95}=3.00$
Interaction C x R          2      12855     6427.50     21.94     $F_{2;162;0.95}=3.00$
Interaction C x L          2      12177     6088.50     20.78     $F_{2;162;0.95}=3.00$
Interaction R x L          4      1359      339.75      1.16      $F_{4;162;0.95}=2.37$
Interaction C x R x L      4      8614      2153.5      7.35      $F_{4;162;0.95}=2.37$
Error                      162    47458     292.95      -         -
Total                      179    162883    909.96      -         -

The table clearly shows that all factors and interactions are statistically significant at the 95% confidence level except for the second-order interaction R x L. It should be noted that, consistent with the degrees of freedom in the table, columns refer to the kind of peas (peeled or processed), rows to the granulometric content of the peas (A, B and C), and layers to the kinds of preparation (R1, R2 and R3).

Example  1.37  [17] 

Analysis of variance has also been applied in testing three factors of composite rocket propellant burning rate in a Crawford bomb at 100 bar and 20 °C. The analyzed factors are: content of the fine fraction in the bimodal oxidizer mixture (C); content ratio of oxidizer mixture to aluminum powder (B); and content of the burning-rate catalyst (A). The obtained burning rate values are given in the table in mm/s:

Table 1.43 Analysis of variance

                A1                      A2                      Sum
                C1          C2          C1          C2
B1              7.1         7.5         9.8         15.6
                7.0         7.0         9.6         15.9        79.5
                [14.1]      [14.5]      [19.4]      [31.5]
B2              7.8         9.1         10.6        20.0
                7.8         9.4         11.1        20.8        96.6
                [15.6]      [18.5]      [21.7]      [40.8]
Sum             29.7        33.0        41.1        72.3        176.1

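I=2; J=2; K=2; L=2; N = I x J x K x L = 16;

$\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2=2250.09$; $X_{....}=176.1$; $X_{....}^2=31011.21$

$\sum_i X_{i...}^2=79.5^2+96.6^2=15651.81$

$\sum_j X_{.j..}^2=(29.7+33.0)^2+(41.1+72.3)^2=16790.85$

$\sum_k X_{..k.}^2=(29.7+41.1)^2+(33.0+72.3)^2=16100.73$

$\sum_i\sum_j X_{ij..}^2=(14.1+14.5)^2+(19.4+31.5)^2+(15.6+18.5)^2+(21.7+40.8)^2=8477.83$

$\sum_i\sum_k X_{i.k.}^2=(14.1+19.4)^2+(15.6+21.7)^2+(14.5+31.5)^2+(18.5+40.8)^2=8146.03$

$\sum_j\sum_k X_{.jk.}^2=(14.1+15.6)^2+(14.5+18.5)^2+(19.4+21.7)^2+(31.5+40.8)^2=8887.59$

$\sum_i\sum_j\sum_k X_{ijk.}^2=14.1^2+14.5^2+19.4^2+31.5^2+15.6^2+18.5^2+21.7^2+40.8^2=4498.81$

$$SS_T=\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2-\frac{X_{....}^2}{N}=2250.09-\frac{31011.21}{16}=311.89$$

$$SS_C=\frac{\sum_j X_{.j..}^2}{I\times K\times L}-\frac{X_{....}^2}{N}=\frac{16790.85}{8}-\frac{31011.21}{16}=160.66$$

$$SS_R=\frac{\sum_i X_{i...}^2}{J\times K\times L}-\frac{X_{....}^2}{N}=\frac{15651.81}{8}-\frac{31011.21}{16}=18.28$$

$$SS_L=\frac{\sum_k X_{..k.}^2}{I\times J\times L}-\frac{X_{....}^2}{N}=\frac{16100.73}{8}-\frac{31011.21}{16}=74.39$$

$$SS_{CR}=\frac{\sum_i\sum_j X_{ij..}^2}{K\times L}-\frac{\sum_i X_{i...}^2}{J\times K\times L}-\frac{\sum_j X_{.j..}^2}{I\times K\times L}+\frac{X_{....}^2}{N}=\frac{8477.83}{2\times2}-\frac{15651.81}{2\times2\times2}-\frac{16790.85}{2\times2\times2}+\frac{31011.21}{16}=2.33$$

$$SS_{CL}=\frac{\sum_j\sum_k X_{.jk.}^2}{I\times L}-\frac{\sum_j X_{.j..}^2}{I\times K\times L}-\frac{\sum_k X_{..k.}^2}{I\times J\times L}+\frac{X_{....}^2}{N}=\frac{8887.59}{2\times2}-\frac{16790.85}{2\times2\times2}-\frac{16100.73}{2\times2\times2}+\frac{31011.21}{16}=48.65$$

$$SS_{RL}=\frac{\sum_i\sum_k X_{i.k.}^2}{J\times L}-\frac{\sum_i X_{i...}^2}{J\times K\times L}-\frac{\sum_k X_{..k.}^2}{I\times J\times L}+\frac{X_{....}^2}{N}=\frac{8146.03}{2\times2}-\frac{15651.81}{2\times2\times2}-\frac{16100.73}{2\times2\times2}+\frac{31011.21}{16}=5.64$$

$$SS_E=\sum_i\sum_j\sum_k\sum_l x_{ijkl}^2-\frac{\sum_i\sum_j\sum_k X_{ijk.}^2}{L}=2250.09-\frac{4498.81}{2}=0.69$$

$$SS_{CRL}=SS_T-SS_C-SS_R-SS_L-SS_{CR}-SS_{CL}-SS_{RL}-SS_E$$

$$=311.89-160.66-18.28-74.39-2.33-48.65-5.64-0.69=1.25$$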


Table 1.44 Analysis of variance

Source of variation        f     SS        MS        F          Ft
Factor A                   1     160.66    160.66    1868.14    $F_{1;8;0.95}=5.32$
Factor B                   1     18.28     18.28     212.56     $F_{1;8;0.95}=5.32$
Factor C                   1     74.39     74.39     865.00     $F_{1;8;0.95}=5.32$
Interaction A x B          1     2.33      2.33      27.09      $F_{1;8;0.95}=5.32$
Interaction A x C          1     48.65     48.65     565.70     $F_{1;8;0.95}=5.32$
Interaction B x C          1     5.64      5.64      65.58      $F_{1;8;0.95}=5.32$
Interaction A x B x C      1     1.25      1.25      14.53      $F_{1;8;0.95}=5.32$
Error                      8     0.69      0.086     -          -
Total                      15    311.89    -         -          -

The analysis of variance clearly shows that the strongest influence on the burning rate of the composite rocket propellant (the highest calculated F-value) comes from the change in catalyst content. The F-value of the interaction A x C, i.e., the interaction between the catalyst content and the fine-fraction content of the oxidizer mixture, is also high. Since a significant second-order interaction means that the strength of one factor's influence on the response depends on the level of the other factor, we can conclude that the efficiency of the burning-rate catalyst depends not only on its mass but also on the granulometric composition of the oxidizer. The small experimental error indicates that reproducibility is very good and that, in general, all factors influencing the burning rate of the composite rocket propellant have been taken into consideration.
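For replicated designs such as this one, every main-effect and interaction sum of squares follows the same add-and-subtract pattern over marginal totals, so the whole table can be generated by one routine. The sketch below is illustrative only: the function name `factorial_anova_ss`, the dictionary keyed by cell index, and the index order (B, A, C) are assumptions made here. It keeps unrounded intermediate values, so the three-factor term differs slightly from the hand value obtained by subtracting rounded figures.

```python
from itertools import combinations, product


def factorial_anova_ss(x, shape):
    """Sums of squares for a full factorial design with replicates.

    x maps a cell index tuple to the list of replicate observations;
    shape gives the number of levels per factor (layout assumed for this sketch).
    """
    n_factors = len(shape)
    cells = list(product(*(range(n) for n in shape)))

    def margin(keep):
        # sum over marginal totals of (total^2 / number of observations in it)
        totals, counts = {}, {}
        for cell in cells:
            key = tuple(cell[a] for a in keep)
            totals[key] = totals.get(key, 0.0) + sum(x[cell])
            counts[key] = counts.get(key, 0) + len(x[cell])
        return sum(totals[key] ** 2 / counts[key] for key in totals)

    ss = {}
    for size in range(1, n_factors + 1):
        for effect in combinations(range(n_factors), size):
            value = 0.0
            # alternate signs over all subsets of the effect, down to the correction term
            for sub_size in range(size + 1):
                sign = (-1) ** (size - sub_size)
                for sub in combinations(effect, sub_size):
                    value += sign * margin(sub)
            ss[effect] = value
    raw = sum(v * v for cell in cells for v in x[cell])
    ss["error"] = raw - margin(tuple(range(n_factors)))
    ss["total"] = raw - margin(())
    return ss


# Data of Example 1.37, indexed as (B, A, C) with two replicates per cell
burning_rate = {
    (0, 0, 0): [7.1, 7.0], (0, 0, 1): [7.5, 7.0],
    (0, 1, 0): [9.8, 9.6], (0, 1, 1): [15.6, 15.9],
    (1, 0, 0): [7.8, 7.8], (1, 0, 1): [9.1, 9.4],
    (1, 1, 0): [10.6, 11.1], (1, 1, 1): [20.0, 20.8],
}
ss = factorial_anova_ss(burning_rate, (2, 2, 2))
for effect, value in ss.items():
    print(effect, round(value, 2))
```

Here the key `(1,)` is factor A, `(0,)` is factor B, `(2,)` is factor C, and `(1, 2)` is the A x C interaction highlighted in the discussion above.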
Problem 1.26 [9]

The following three process factors were tested to determine the influence of conditions in the technology of isatin production:

• factor A: basic raw material concentration;

• factor B: time of reaction;

• factor C: temperature of reaction.

Table 1.45 Analysis of variance

        C1                  C2
        B1        B2        B1        B2
A1      6.08      6.53      6.79      6.73
        6.31      6.12      6.77      6.49
A2      6.04      6.43      6.68      6.08
        6.09      6.36      6.38      6.23

The results are given in the table. By applying analysis of variance, rank the factors according to their influence on the measured chemical reaction yield.


Problem  1.27  [4] 

The  following  data  give  the  yields  of  a product  that  resulted  from 
trying  catalysts  from  four  different  suppliers  in  a process.  Deter- 


Catalyst
I       36    33    35    34    32    34
II      35    37    36    35    37    36
III     35    39    37    38    39    38
IV      34    31    35    32    34    33

a)  Are  yields  influenced  by  catalysts? 

b)  What  are  your  recommendations  in  the  selection  of  a catalyst? 
Problem 1.28 [4]

Random samples of size ten were drawn from normal populations A, B, C and D. The measurements of property X are given in the table below, along with the column totals and sums of squares.

Are there significant differences between the population means?

Table 1.46 Analysis of variance

                        $X_A$         $X_B$         $X_C$         $X_D$
                        3.355         0.273         3.539         3.074
                        1.086         2.155         2.929         3.103
                        2.367         1.725         3.025         2.389
                        0.248         0.949         4.097         4.766
                        1.694         0.458         2.236         2.553
                        1.546         1.455         3.256         3.821
                        1.266         2.289         3.374         1.905
                        0.713         2.673         1.781         2.350
                        0.000         1.800         2.566         1.161
                        3.406         2.407         2.510         2.122
$X_{.j}$                15.681        16.184        29.313        27.244
$\bar{x}_{.j}$          1.5681        1.6184        2.9313        2.7244
$\sum_i x_{ij}^2$       37.071327     32.339668     90.081121     83.620342

Problem  1.29 

Samples of steels from four different batches were analyzed for carbon content. The results are shown below for quadruplicate determinations by the same analyst. Are the carbon contents (given in weight per cent) of these batches the same? What are the 99% confidence limits on the average carbon content of each?

Table 1.47 Analysis of variance

                        No. 1      No. 2      No. 3      No. 4
                        0.39       0.36       0.32       0.43
                        0.41       0.35       0.36       0.39
                        0.36       0.35       0.42       0.38
                        0.38       0.37       0.40       0.41
$X_{.j}$                1.54       1.43       1.50       1.61       $X_{..}=6.08$
$\bar{x}_{.j}$          0.3850     0.3575     0.3750     0.4025
$\sum_i x_{ij}^2$       0.5942     0.5115     0.5684     0.6495     $\sum_i\sum_j x_{ij}^2=2.3236$
Problem 1.30

The torque outputs for several pneumatic actuators at several different supply pressures are given in the table below.

a) Do the different supply pressures significantly affect the output?

b) Does the output vary for the different models?

Table 1.48 Analysis of variance

        P1        P2        P3
A       205       270       340
B       515       700       880
C       1775      2450      3100
D       7200      9600      12100

Problem  1.31  [4] 

Results of a laboratory analysis for the specific rate constant for the saponification of ethyl acetate by NaOH at 0 °C are given below. One of the runs was performed with an electric stirrer, while the others were hand stirred. The results were analyzed by the differential and integral methods. Does the electric stirrer make a significant difference to the results? Does the method of analysis of data make a significant difference?

Table 1.49 Analysis of variance

                        With stirrer              Without stirrer
Integral method         0.02058     0.02121       0.01849     0.01816
                        0.02214     0.02073       0.01951     0.01884
Differential method     0.01995     0.02003       0.01725     0.01752
                        0.01968     0.01982       0.01696     0.01726

Problem  1.32  [12] 

Observe  an  experiment  aimed  at  defining  the  factors  that  have  an 
essential  effect  on  energy  consumption  in  processing  metal  on  a 
lathe  with  ceramic  cutting  tools.  In  metal  processing,  in  fact,  the 
energy  is  measured  by  a dynamometer  and  it  is  proportional  to 
power  consumption.  Some  of  the  factors  influencing  the  observed 
response  are:  type  of  cutting  tools,  angle  of  cutting,  depth  of  cutting, 
speed  of  moving  the  metal,  type  of  cutting,  and  turning  speed  of 
axis. In this experiment the cutting depth was fixed at 2.5 mm, the speed of moving the metal at 0.3 mm/min, and the turning speed at 1000 min⁻¹.

The other three factors were each varied at two levels: type of cutting tools, angle of cutting and type of cutting. The experiment data after four replications are given in the table:


Table 1.50 Analysis of variance

Type of cutting            Cutting tools T1                    Cutting tools T2
                           Cutting angle    Cutting angle      Cutting angle    Cutting angle
                           15 (B1)          30 (B2)            15 (B1)          30 (B2)
Continuously (C1)          29.0             28.5               28.0             29.5
                           26.5             28.5               28.5             32.0
                           30.5             30.0               28.0             29.0
                           27.0             32.5               25.0             28.0
With interruption (C2)     28.0             27.0               24.5             27.5
                           25.0             29.0               25.0             28.0
                           26.5             27.5               28.0             27.0
                           26.5             27.5               26.0             26.0

Problem  1.33  [10] 

In  developing  a procedure  for  bacteriological  testing  of  milk, 
samples  were  tested  in  an  apparatus  that  includes  two  components: 
bottles  and  kivets.  All  six  combinations  of  two  bottle  types  and  three 
kivet  types  were  tested  ten  times  for  each  sample.  The  table  contains 
data on the number of positive tests in each of ten testings. As discussed in section 1.1.1, the number of positive tests is a random variable with the binomial distribution. For a correct application of the analysis of variance procedure, the results should be normally distributed. It is therefore possible to transform the obtained results by means of the arcsine transformation. For the purpose of this example of three-way analysis of variance with no replications, however, no such transformation is necessary. The experiment results are given in the table:
Table 1.51 Analysis of variance

Bottles             I                     II
Kivets              A      B      C       A      B      C
Milk samples        1      1      1       1      3      2
                    3      4      2       2      1      3
                    3      2      4       3      3      6
                    2      4      1       1      0      0
                    2      1      3       2      4      6
                    1      1      2       0      2      1
                    5      5      5       3      5      5
                    1      1      1       0      2      0
                    0      1      2       2      2      2
                    3      4      5       1      1      3
                    0      0      4       0      2      1
                    0      1      2       0      3      1

Problem  1.34  [21] 

Comparative analysis of different technological solutions for producing nitrate compounds was done in lab conditions. Three factors of importance for nitrate compounds production yield were tested:

a) time of dosing the nitric acid (A);

b) time of mixing the reaction mixture (B);

c) the factor of mixture remains from the previous batch (C).

The experimental results are given in the table:

Table 1.52 Analysis of variance

        C1                  C2
        A1        A2        A1        A2
B1      87.2      88.4      86.7      89.2
B2      82.0      83.0      83.4      83.7
Problem 1.35 [9]

Paste is produced in a batch-type technological procedure. Three samples (A, B, and C) were taken from each batch and each of them tested twice for tearing strength. The obtained results and their sums are in the table. Do the analysis of variance.

Table 1.53 Analysis of variance

Batch    Sample A                  Sample B                  Sample C                  Sum       Mean
                         Sum                       Sum                       Sum
1        50.3    49.8    100.1     50.1    49.5    99.6      51.1    49.4    100.5     300.2     50.03
2        45.8    45.4    91.2      44.4    44.7    89.1      44.7    44.4    89.1      269.4     44.90
3        41.0    41.4    82.4      42.7    41.6    84.3      43.1    43.3    86.4      253.1     42.18
4        48.7    50.0    98.7      48.0    50.4    98.4      47.9    48.5    96.4      293.5     48.92
5        48.9    49.4    98.3      48.4    46.8    95.2      46.5    45.4    91.9      285.4     47.57
6        47.0    46.1    93.1      47.4    47.2    94.6      45.1    47.5    92.6      280.3     46.72
7        46.3    45.0    91.3      44.6    44.0    88.6      45.6    44.2    89.8      269.7     44.95
8        44.9    42.3    87.2      45.1    43.4    88.5      43.3    41.6    84.9      260.6     43.43
9        55.7    55.4    111.1     56.3    56.3    112.6     55.1    55.0    110.1     333.8     55.63
Sum                      853.4                     850.9                     841.7     2546.0    -
Mean                     47.4                      47.3                      46.8      47.1      -

Problem  1.36  [9] 

The table shows logarithms of CaCO3 chalk filling weights from two
deposits, A and B, as determined in 11 laboratories with three
replications. Use analysis of variance to determine whether there are
significant differences between labs or chalk deposits.


Table 1.54 Analysis of variance

Laboratory   Chalk A                         Chalk B                         Mean
                                    Mean                            Mean
1            0.851  0.851  0.851   0.851     0.681  0.686  0.681   0.683    0.767
2            0.863  0.866  0.860   0.863     0.690  0.690  0.695   0.692    0.777
3            0.854  0.854  0.854   0.854     0.686  0.686  0.690   0.687    0.771
4            0.863  0.869  0.869   0.867     0.690  0.690  0.690   0.690    0.779
5            0.869  0.872  0.872   0.871     0.699  0.703  0.699   0.700    0.786
6            0.875  0.869  0.872   0.872     0.686  0.690  0.686   0.687    0.780
7            0.845  0.857  0.851   0.851     0.672  0.672  0.681   0.675    0.763
8            0.869  0.869  0.872   0.870     0.699  0.695  0.695   0.696    0.783
9            0.857  0.857  0.857   0.857     0.681  0.681  0.681   0.681    0.769
10           0.869  0.881  0.875   0.875     0.695  0.690  0.690   0.692    0.783
11           0.881  0.881  0.881   0.881     0.708  0.708  0.708   0.708    0.795
Mean                               0.865                           0.690    0.777

Problem 1.37 [9]

Effects on physical properties of vulcanized rubber were analyzed in
a research lab. Based on research results, the influence of three
factors on the strength of vulcanized rubber was tested:

a) five qualities of rubber filler, factor A;

b) three methods of previous rubber treatment, factor B;

c) four qualities of raw rubber, factor C.

Do  the  analysis  of  variance. 

Table 1.55 Analysis of variance

                                     Factor C
             Factor B          Factor B          Factor B          Factor B
Factor A    404  478  530     381  429  528     316  376  390     423  482  550
            392  418  431     239  251  249     186  207  194     410  416  452
            348  381  460     327  372  482     290  315  350     383  376  496
            296  291  333     165  232  242     158  279  220     301  306  330
            186  198  125     129  157  197     105  163  190     213  200  255


Problem  1.38  [9] 

In a chemical plant nine aluminum alloys were tested for their
resistance to corrosion. Four locations were chosen inside the chemical
plant for this experiment, and at each location one plate of each of the
nine alloys was placed. Exposure to chemical corrosion lasted one year.
After the experiment four researchers examined the plates in random
order and gave each plate a mark from 1 to 10 depending on the
observed resistance to corrosion. The experiment was aimed at
ascertaining which of the offered aluminum alloys had the best
resistance to corrosion at one or at all locations of the chemical plant.
It was also interesting to see how much the researchers agreed or
disagreed in their estimates of resistance to corrosion. Thus, the
experiment included three factors: nine aluminum alloys, four
locations and four researchers. Experimental results are shown in the
following table:

Table 1.56 Analysis of variance


Location   Researcher                      Alloys                           Sum
                         I    II   III  IV   V    VI   VII  VIII IX
I          A             5    5    5    4    6    6    1    6    7           45
           B             4    5    5    4    5    3    1    5    7           39
           C             7    7    7    7    8    5    4    7    7           59
           D             6    5    4    5    7    6    3    6    7           49
           Sum           22   22   21   20   26   20   9    24   28         192
II         A             8    7    7    7    5    4    5    4    5           52
           B             7    8    6    7    6    5    3    7    8           57
           C             9    9    9    9    8    6    7    8    8           73
           D             8    8    7    7    5    5    7    4    5           56
           Sum           32   32   29   30   24   20   22   23   26         238
III        A             4    4    5    3    4    3    0    5    5           33
           B             1    3    3    2    5    2    0    4    5           25
           C             5    5    5    6    6    4    3    7    9           50
           D             3    3    7    2    3    3    1    6    6           34
           Sum           13   15   20   13   18   12   4    22   25         142
IV         A             6    5    6    5    6    4    4    7    5           48
           B             1    3    6    5    5    4    3    6    5           38
           C             5    5    7    6    8    7    5    8    8           59
           D             5    3    5    3    5    3    3    7    6           40
           Sum           17   16   24   19   24   18   15   28   24         185
Grand sum                84   85   94   82   92   70   50   97   103        757

Problem  1.39  [4] 

The following values show the percentage of conversion of acetic
acid into acetic anhydride at 750 °C by catalytic cracking at
different percentages of triethyl phosphate (TEP) as catalyst. The
input flow of material was varied.


a)  Does  the  catalyst  percentage  affect  conversion? 

b)  Does  the  input  flow  of  material  affect  conversion? 


Table 1.57 Analysis of variance

Flow                  Level of catalyst
GAL/I       0.5% TEP    0.3% TEP    0.1% TEP
200.0        77.85       76.03       73.87
147.0        89.12       88.94       87.65
 98.5        99.09       97.14       91.78
 61.0        99.55       99.51       97.60

Problem 1.40 [4]

In air-water contacting in a packed tower, the gas film heat transfer
coefficient depends on the flows of the liquid and gaseous phases.
By varying these two flows, values of the heat transfer coefficient
were obtained; they are presented in the table:

Table 1.58 Analysis of variance

Gas flow            Liquid flow L
G            190     250     300     400
200          200     226     240     261
400          278     312     330     381
700          369     416     462     517
1100         500     575     645     733

a) Perform the indicated two-way ANOVA for these data. Use the α = 0.01
significance level.

b)  Which  variable  has  the  greater  effect  on  the  heat  transfer  coefficient? 


Problem  1.41  [18] 

This problem presents the results of research into the tear strength of
adhesive systems on surfaces with no primer and on those where
primers were applied. A distinction was also made as to the thickness
of the sample. Values of tear strength marked with an asterisk were
obtained from samples of one thickness and those without an asterisk
from samples of a different thickness.


Apply the two-way analysis of variance to the values with an asterisk so
as to determine the significance of effects of different adhesive
systems and of previous preparation of the surface.

Table 1.59 Analysis of variance


          With primer                            With no primer
          Adhesive systems                       Adhesive systems
  I        II       III      IV          I        II       III      IV
60.0*    57.0*    19.8*    52.0*       59.0*    51.0*    29.4*    49.0*
73.0     52.0     32.0     77.0        78.0     52.0     37.8     77.0
63.0*    52.0*    19.5*    53.0*       48.0*    44.0*    32.2*    59.0*
79.0     56.0     33.0     78.0        72.0     42.0     36.7     76.0
57.0*    55.0*    19.7*    44.0*       51.0*    42.0*    37.1*    55.0*
70.0     57.0     32.0     70.0        72.0     51.0     35.4     79.0
53.0*    59.0*    21.6*    48.0*       49.0*    54.0*    31.5*    54.0*
69.0     58.0     34.0     74.0        75.0     47.0     40.2     78.0
56.0*    56.0*    21.1*    48.0*       45.0*    47.0*    31.3*    49.0*
78.0     52.0     31.0     74.0        71.0     57.0     40.7     79.0
57.0*    54.0*    19.3*    53.0*       48.0*    56.0*    33.0*    58.0*
74.0     53.0     27.3     81.0        72.0     45.0     42.6     79.0

Problem 1.42

Repeat Problem 1.41 but for values with no asterisk.

Problem 1.43

As the values with an asterisk in Problem 1.41 were obtained for one
thickness of the sample and those with no asterisk for another one,
apply the three-way analysis of variance to all the data in
Problem 1.41, taking into account the new sample thickness factor.

Problem 1.44

In a pilot plant for producing composite rocket propellants 12
batches were produced, and seven experimental rocket motors were
cast and cured from each batch. By static firing of these motors
at 25 °C, burning-rate laws as a function of pressure were obtained.
From each such law a rate value at 70 bar pressure was obtained. The
propellant batches were mixed with two batches of ammonium
perchlorate and two types of catalysts. Besides, each batch was repeated
three times. The burning-rate data at 25 °C and P = 70 bar for all
mixing conditions of the propellants are given in the table below. By
applying the two-way analysis of variance with replication, determine
with 95% confidence level whether the catalyst type and the
preparation of 10 μm ammonium perchlorate significantly affect the
propellant burning rate at 25 °C and 70 bar.

Table 1.60 Analysis of variance


                 Catalyst A                    Catalyst B
Batch AP (A)     14.199  14.197  14.193        15.716  15.612  15.682
Batch AP (B)     14.398  14.418  14.307        15.616  15.912  15.482

Summary  of  analysis  of  variance 

Analysis  of  variance  is  a procedure  by  which  the  total  variance  is  divided  into  sources 
of  variations.  Depending  on  the  experiment  design  done,  it  is  possible  to  separate 
from  the  total  variance  a different  number  of  sources  of  variations.  However,  no 
matter  how  many  sources  of  variations  were  selected,  they  all  refer  both  to  those 
that  occur  under  the  influence  of  systematic  variations  and  to  the  error  resulting 
from random variations. The aim of applying the analysis of variance method is to
answer the question: is the difference between the obtained response means for the
tested factors a result of the influence of the tested factors, or has it occurred randomly?

In  solving  the  problem  we  shall  start  from  the  null  hypothesis,  which  assumes 
that  the  response  means  for  the  tested  factors  are  the  same  and  that  the  existing 
differences  have  occurred  randomly. 

Hence  the  null  hypothesis  is: 

$H_0:\ \bar{X}_1 = \bar{X}_2 = \dots = \bar{X}_j$

The  answer  to  the  question  is  obtained  by  either  accepting  or  rejecting  the  null 
hypothesis,  achieved  through  the  analysis  of  variance  method  and  performed  in  the 
following  way: 

• If  the  obtained  differences  of  the  response  means  for  the  varied  factors  are 
significant,  the  null  hypothesis  is  rejected  and  we  conclude  that  they  result 
from  the  influence  of  the  factor. 

• If the obtained differences are not significant, we accept the null hypothesis
and consider the existing differences as random ones.

Practically,  this  is  what  it  looks  like: 

If the calculated value of the applied statistical test is lower than the associated
tabular value for the threshold or level of significance α, i.e. the confidence is
(1 − α) × 100%, the influence or effect of the factor is not statistically significant.
The hypothesis is accepted and we conclude that the differences between the means
are a consequence of random fluctuations.

If the calculated value of the applied statistical test is above the associated tabular
value for the threshold or level of significance α, i.e. the confidence is (1 − α) × 100%,
the difference or effect is statistically significant. The hypothesis is rejected and the
conclusion is that such a difference may in the future be expected in (1 − α) × 100%
of cases provided the experiment is done under identical conditions. Hence only in
α × 100% of cases can a different outcome be expected.

To enable successful application of analysis of variance, certain assumptions that
underlie the models and the conclusions drawn should be fulfilled. One of the
basic assumptions is additivity of effects of the tests and of model components in
general. The next assumption is that the error ε_ij has a normal distribution for each test
individually and for the experiment as a whole. The third one refers to the statistical
equality or homogeneity of test variances. The given assumptions justify the use of
the F-test.


Bartlett’s test for equality of variances

The analysis of variance technique for testing equality of means is a rather robust
procedure. That is, when the assumptions of normality and homogeneity of variances
are “slightly” violated, the F-test remains a good procedure to use. In the one-way
model, for example, with an equal number of observations per column it has been
shown that the F-test is not significantly affected. However, if the sample size
varies across columns, then the validity of the F-test can be greatly affected. There are
various techniques for testing the equality of k variances $\sigma_1^2, \sigma_2^2, \dots, \sigma_k^2$. We discuss
here the most widely used technique, Bartlett’s $\chi^2$ test for homogeneity of variances.

Let $S_1^2, S_2^2, \dots, S_k^2$ be k independent sample variances corresponding to k normal
populations with means $\mu_i$ and variances $\sigma_i^2$ (i = 1, 2, ..., k). Suppose
$n_1 - 1, n_2 - 1, \dots, n_k - 1$ are the respective degrees of freedom. Bartlett [19] proposed the statistic:

$$\chi^2 = \frac{(\ln V)\sum_{i=1}^{k}(n_i - 1) - \sum_{i=1}^{k}(n_i - 1)\ln S_i^2}{\ell} \tag{1.140}$$

where:

$$V = \frac{\sum_{i=1}^{k}(n_i - 1)S_i^2}{\sum_{i=1}^{k}(n_i - 1)} \tag{1.141}$$

and ln denotes the natural logarithm. The denominator in Eq. (1.140) is defined by:

$$\ell = 1 + \frac{1}{3(k-1)}\left[\sum_{i=1}^{k}\frac{1}{n_i - 1} - \frac{1}{\sum_{i=1}^{k}(n_i - 1)}\right] \tag{1.142}$$

It can be shown that the statistic in Eq. (1.140) has an approximate $\chi^2$-distribution
with k − 1 degrees of freedom when used as a test statistic for $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$.
Given k random samples of sizes $n_1, n_2, \dots, n_k$ from k independent normal populations,
the statistic $\chi^2$ in (1.140) can be used to test $H_0$. Recall that a sample variance is:

$$S^2 = \frac{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}{n - 1} \tag{1.143}$$

If all the samples are of the same size, say n, then Bartlett’s statistic in Eq. (1.140)
becomes:

$$\chi^2 = \frac{(n - 1)\left[k\ln V - \sum_{i=1}^{k}\ln S_i^2\right]}{\ell} \tag{1.144}$$

where:

$$\ell = 1 + \frac{k + 1}{3k(n - 1)}$$

The connection between natural and decadic logarithms gives, if $n_i = n$ and i = 1, 2, ..., k:

$$\chi^2 = \frac{2.3026\,(n - 1)\left[k\log V - \sum_{i=1}^{k}\log S_i^2\right]}{\ell} \tag{1.145}$$

The rejection region for testing $H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$ is:

$$\chi^2 > \chi^2_{k-1;\,1-\alpha} \tag{1.146}$$


Example  1.38  [4] 

In an experiment to determine the effects of sample size and amount of liquid phase on
the height equivalent to a theoretical plate (HETP) in gas chromatography, it was
necessary to utilize solid support material from different batches. It was therefore imperative
that the resulting data be checked for homogeneity prior to attempting to develop any
quantitative expressions regarding the effects of these variables on HETP. Several sets
of data points were selected at random and examined using Bartlett’s test.

In particular, a set of four HETP values obtained for cyclohexane for a 4 μl sample
injected into a 40% β,β'-oxydipropionitrile column were:

$X_1 = 0.44$; $X_1^2 = 0.1936$

$X_2 = 0.44$; $X_2^2 = 0.1936$

$X_3 = 0.40$; $X_3^2 = 0.1600$

$X_4 = 0.43$; $X_4^2 = 0.1849$

$\sum X_i = 1.71$; $\sum X_i^2 = 0.7321$; $\bar{X} = 1.71/4 = 0.4275$; $\bar{X}^2 = 0.18275625$

$$S^2 = \frac{\sum X_i^2 - n\bar{X}^2}{n - 1} = \frac{0.7321 - 4 \times 0.18275625}{3} = 3.583 \times 10^{-4}$$

The variances of 10 cyclohexane data sets, each consisting of four observations, are
thus calculated and presented below. In this case $n_i = 4$, i = 1, 2, ..., 10, and k = 10.


Liquid phase, β,β'-oxydipropionitrile     S_i^2          log S_i^2

40%-4 μl sample                           0.0003583      -3.44575
30%-8 μl sample                           0.0002250      -3.64782
20%-10 μl sample                          0.0002250      -3.64782
20%-4 μl sample                           0.0000916      -4.03810
10%-4 μl sample                           0.0000916      -4.03810
5%-10 μl sample                           0.0003000      -3.52288
5%-2 μl sample                            0.0002250      -3.64782
3%-8 μl sample                            0.0002250      -3.64782
3%-6 μl sample                            0.0003000      -3.52288
10%-2 μl sample                           0.0002250      -3.64782
Sum                                       0.0022665      -36.80681

The computations needed to calculate $\chi^2$ according to Eq. (1.145) are:

V = 0.0022665/10 = 0.00022665; log V = -3.64464

k log V = 10 × (-3.64464) = -36.4464

$\ell$ = 1 + 11/(30 × 3) = 1.1222

$\chi^2$ = 2.3026 × (4 - 1) × [-36.4464 - (-36.80681)]/1.1222 = 2.218

The tabular value $\chi^2_{9;\,0.975} = 19.0$ is considerably higher than the calculated value, so that
with a 95% probability one can assert that the variances are equal. A simpler and
more frequently used test for equality of variances is Cochran’s test.
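The arithmetic of Example 1.38 can be retraced with a short Python sketch. It assumes equal sample sizes and uses the natural-logarithm form, Eq. (1.144); the function name `bartlett_equal_n` is illustrative only and does not come from the text.

```python
import math

def bartlett_equal_n(variances, n):
    """Bartlett's chi-square statistic for k sample variances,
    each computed from n observations (equal sample sizes)."""
    k = len(variances)
    pooled = sum(variances) / k                       # V for equal sample sizes
    log_sum = sum(math.log(s2) for s2 in variances)   # sum of ln S_i^2
    correction = 1 + (k + 1) / (3 * k * (n - 1))      # denominator of Eq. (1.144)
    chi2 = (n - 1) * (k * math.log(pooled) - log_sum) / correction
    return chi2, k - 1                                # statistic, degrees of freedom

# The ten variances listed in Example 1.38 (four observations each)
variances = [0.0003583, 0.0002250, 0.0002250, 0.0000916, 0.0000916,
             0.0003000, 0.0002250, 0.0002250, 0.0003000, 0.0002250]

chi2, dof = bartlett_equal_n(variances, n=4)
print(f"chi2 = {chi2:.3f} with {dof} degrees of freedom")
```

The result should agree with the hand calculation (about 2.22) up to rounding, and it is compared with the tabular value in the same way as in the example.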

Cochran’s  test  for  equality  of  variances 

Cochran’s test for equality of variances is relatively simple; its arithmetic is
brought down to calculating:

$$C = \frac{S_{i\,\max}^2}{\sum_i S_i^2} \tag{1.147}$$

where:

$S_{i\,\max}^2$ is the largest variance in a sequence of tested variances;

$\sum_i S_i^2$ is the sum of all test variances in the experiment.

If the calculated value of Cochran’s test is below the tabular value, the null hypothesis
that variances are equal is accepted. Tabular values are in Table I and are determined
for the associated threshold or significance level α = 0.05 and the degrees of freedom:

$f_1 = n - 1$, the number of data used in calculating the variance $S_i^2$ reduced by one, or in
other words the number of replicated design points (trials) reduced by one;

$f_2$, the number of variances, or number of trials.

Example  1.39 

Apply  Cochran’s  test  for  equality  of  variances  to  Example  1.38. 

The  previous  example  gives: 

$S_{i\,\max}^2 = 0.0003583$; $\sum_i S_i^2 = 0.0022665$; $f_1 = n - 1 = 4 - 1 = 3$; $f_2 = 10$; α = 0.05.

$C = 0.0003583/0.0022665 = 0.1581$; $C_{3;\,10;\,0.95} = 0.3733$

Due to the fact that the calculated value is smaller than the tabular value, the null
hypothesis that variances are equal is accepted.
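Cochran's ratio is simple enough to check in a few lines. The sketch below recomputes it from the ten variances tabulated in Example 1.38; the function name `cochran_statistic` is illustrative, and the tabular value 0.3733 is taken from the example rather than computed.

```python
def cochran_statistic(variances):
    """Ratio of the largest variance to the sum of all variances."""
    return max(variances) / sum(variances)

# The ten variances listed in Example 1.38
variances = [0.0003583, 0.0002250, 0.0002250, 0.0000916, 0.0000916,
             0.0003000, 0.0002250, 0.0002250, 0.0003000, 0.0002250]

c_calc = cochran_statistic(variances)
c_tab = 0.3733            # tabular value for f1 = 3, f2 = 10, alpha = 0.05
variances_equal = c_calc < c_tab
print(f"C = {c_calc:.4f}, equality of variances accepted: {variances_equal}")
```

With the variances as tabulated, the ratio comes out at about 0.158, well below the tabular value.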

Data  transformation 

By  transformation,  the  original  experimental  data  obtain  other  values  on  the  basis  of 
which  the  analysis  of  variance  is  done.  There  are  three  main  reasons  for  its  applica- 
tion: 

• to  achieve  equality  of  variances; 

• to  achieve  normality  of  data  distribution; 

• to  make  the  test  effects  additive. 


The F-distribution is influenced only by test additivity, which has been mentioned as
the third reason for data transformation. It is good when all three assumptions of
analysis of variance are achieved by one transformation. Generally speaking,
transformation will always be applied when there is a link between the test means in the
experiment and their variances. For such a general condition, one should find
the corresponding transformation. We shall first indicate the cases when no form of
transformation could be applied:

• The  test  averages  are  approximately  equal  while  their  variances  are  heteroge- 
neous. 

• The  test  averages  are  independent  of  their  variances. 

• Variances  are  equal  but  distributions  are  heterogeneous. 

In such cases the solution could be to apply one of the nonparametric statistical
procedures. Although nonparametric statistics is characterized by less rigorous
assumptions than parametric statistics, it is relatively less efficient.
There are different forms of transformation. We shall present here the ones most widely
applied in research studies.

The square-root transformation $X' = \sqrt{X}$ is applied when the test
values and variances are proportional, as in the Poisson distribution. If the data come
from counting and the number of units is below 10, the transformation forms
$X' = \sqrt{X + 1}$ and $X' = \sqrt{X} + \sqrt{X + 1}$ are used. If the test averages and their
standard deviations are approximately proportional, we use the logarithmic
transformation $X' = \log X$. If there are data with low values or with a zero value, we use
$X' = \log(X + 1)$. When the squares of arithmetical averages and standard deviations
are proportional we use the reciprocal transformation: $X' = 1/X$, or $X' = 1/(X + 1)$ if the
values are small or are equal to zero. The transformation $\arcsin\sqrt{X}$ is used when
values are given as proportions and when the distribution is binomial. If the test
value of the experiment is zero, then instead of it we take the value 1/(4n), and when
it is 1, 1 − 1/(4n) is taken as the value, where n is the number of values. Transforming
values where the proportion varies between 0.30 and 0.70 is practically pointless.
This transformation is done by means of special tables suited for the purpose.
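The transformations listed above can be written as small Python helpers. This is only a sketch: the function and flag names are illustrative, logarithms are taken to base 10, and the arcsine result is returned in radians, whereas printed tables often give degrees.

```python
import math

def sqrt_transform(x, small_counts=False):
    """Square root of X, or of X + 1 for counts below about 10."""
    return math.sqrt(x + 1) if small_counts else math.sqrt(x)

def log_transform(x, has_zeros=False):
    """log X, or log(X + 1) when low or zero values occur."""
    return math.log10(x + 1) if has_zeros else math.log10(x)

def reciprocal_transform(x, has_zeros=False):
    """1/X, or 1/(X + 1) when small or zero values occur."""
    return 1 / (x + 1) if has_zeros else 1 / x

def arcsine_transform(p, n):
    """arcsin(sqrt(p)) for a proportion p; 0 and 1 are replaced by
    1/(4n) and 1 - 1/(4n), where n is the number of values."""
    if p == 0:
        p = 1 / (4 * n)
    elif p == 1:
        p = 1 - 1 / (4 * n)
    return math.asin(math.sqrt(p))   # radians (an assumption; tables may use degrees)

print(sqrt_transform(9), log_transform(0, has_zeros=True), arcsine_transform(0.5, n=10))
```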

Dilemmas  often  occur  in  actual  cases  in  spite  of  such  clear  indications  on  which 
transformation  to  apply.  There  exist  several  formal  procedures  that  may  be  of  great 
help.  A relatively  simple  procedure  will  be  presented. 

The calculation consists of taking, for each candidate transformation, the transformed
highest and lowest values in each trial and establishing the range as their difference. Then
the ratio between the largest and smallest ranges is established. The transformation
giving the smallest ratio is the transformation form to be applied. As an
illustration we shall take the experiment [20] with one factor, where the influence of
herbicides on weeds is researched and where the factor is varied on six levels and
each test replicated five times.


Table 1.61 Herbicide influence on weeds in sugar beet

Number of                     Factor levels                      Sum
replications     A       B       C       D       E       F
1                520     268     162     120     142     35      1247
2                530     270     164     121     149     30      1264
3                530     280     152     126     140     35      1263
4                490     269     148     111     141     24      1183
5                457     278     158     120     149     25      1187
Sum              2527    1365    784     598     721     149     6144

Table 1.62 Transformation on basis of largest and smallest trial values

                                         Factor                                      Largest and smallest
                      A         B         C         D         E         F            value ratio
Largest value (V)     530       280       164       126       149       35
Smallest value (M)    457       268       148       111       140       24
Interval              73        12        16        15        9         11           73/9 = 8.11

√V                    23.02     16.73     12.81     11.22     12.21     5.92
√M                    21.38     16.37     12.17     10.54     11.83     4.90
Interval              1.64      0.36      0.64      0.68      0.38      1.02         1.64/0.36 = 4.55

log V                 2.7243    2.4472    2.2148    2.1004    2.1732    1.5441
log M                 2.6599    2.4281    2.1703    2.0453    2.1461    1.3802
Interval              0.0644    0.0191    0.0445    0.0551    0.0271    0.1639       0.1639/0.0191 = 8.58

1/V                   0.00186   0.00357   0.00609   0.00794   0.00671   0.02857
1/M                   0.00219   0.00373   0.00676   0.00901   0.00714   0.04167
Interval              0.00033   0.00016   0.00067   0.00107   0.00043   0.01310      0.01310/0.00016 = 81.88

1/√V                  0.04344   0.05977   0.07806   0.08913   0.08190   0.16892
1/√M                  0.04677   0.06109   0.08217   0.09488   0.08453   0.20408
Interval              0.00333   0.00132   0.00411   0.00575   0.00263   0.03516      0.03516/0.00132 = 26.64

This procedure indicates that the square-root transformation is the most suitable
for the results of the experiment, the aim of which was to find out
the influence of herbicides on weeds in sugar beet. Before and after the application
of a transformation the normality of the data distribution should be checked.
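The range-ratio procedure can be sketched in Python using the weed counts of Table 1.61. The dictionary keys and the function name `range_ratio` are illustrative choices, not notation from the text. Because the sketch works with unrounded values, the ratios differ slightly from those obtained by hand from rounded entries.

```python
import math

# Weed counts from Table 1.61: five replications at six factor levels
weeds = {
    "A": [520, 530, 530, 490, 457],
    "B": [268, 270, 280, 269, 278],
    "C": [162, 164, 152, 148, 158],
    "D": [120, 121, 126, 111, 120],
    "E": [142, 149, 140, 141, 149],
    "F": [35, 30, 35, 24, 25],
}

# Candidate transformations (all monotonic, so max and min stay the extremes)
transforms = {
    "none": lambda x: x,
    "sqrt": math.sqrt,
    "log": math.log10,
    "reciprocal": lambda x: 1 / x,
    "reciprocal sqrt": lambda x: 1 / math.sqrt(x),
}

def range_ratio(groups, f):
    """Largest transformed range divided by the smallest one over all trials."""
    ranges = [abs(f(max(v)) - f(min(v))) for v in groups.values()]
    return max(ranges) / min(ranges)

ratios = {name: range_ratio(weeds, f) for name, f in transforms.items()}
best = min(ratios, key=ratios.get)   # transformation with the smallest ratio

for name, value in ratios.items():
    print(f"{name:16s} {value:8.2f}")
print("suggested transformation:", best)
```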

Check  of  the  null  hypothesis  on  normality  of  data  distribution 

If the distribution of data or of a random value is unknown, we can make up a histogram
from them. Intervals associated with groups of random values within the same
ranges are drawn on the abscissa. Above each range a rectangle is drawn, the height of
which is equal to the relative frequency $n_g/N$ of results appearing within the interval, where
$n_g$ is the number of values within the observed range. The following
algorithm for drawing up the histogram may be suggested:

• The region of random value changes $X_{\min}$ to $X_{\max}$ is divided into $\ell$ ranges. The
number of ranges $\ell$ is determined by the formula:

$$\ell = 1 + 3.2\log N \tag{1.148}$$

where:

N is the number of data in a sample.

The range width is calculated from the equation:

$$\Delta_g = \frac{X_{\max} - X_{\min}}{\ell} \tag{1.149}$$

• The number of data $n_g$ (g = 1, 2, ..., $\ell$) within the range $\Delta_g$, or the frequency of
values in the associated interval $p_g^*$, is determined:

$$p_g^* = \frac{n_g}{N} \tag{1.150}$$

• Values falling within the range g have the characteristic mean $X_g^*$:

$$X_g^* = \frac{X_{g-1} + X_g}{2} \tag{1.151}$$

• The histogram of $p_g^*$ over the ranges $X_{g-1}$ to $X_g$ is drawn.
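The steps above can be sketched as a small Python function. Rounding the number of ranges from Eq. (1.148) to the nearest integer is an assumption of this sketch, as is the function name; the input is the set of pH readings quoted in Problem 1.47, used here purely as sample data. The sketch does not handle the degenerate case in which all values are equal.

```python
import math

def histogram(values):
    """Group values into ranges following Eqs. (1.148)-(1.151)."""
    n_total = len(values)
    n_ranges = round(1 + 3.2 * math.log10(n_total))     # Eq. (1.148), rounded (assumption)
    x_min, x_max = min(values), max(values)
    width = (x_max - x_min) / n_ranges                   # Eq. (1.149)
    counts = [0] * n_ranges
    for x in values:
        g = min(int((x - x_min) / width), n_ranges - 1)  # x_max falls into the last range
        counts[g] += 1
    freqs = [c / n_total for c in counts]                # Eq. (1.150)
    centres = [x_min + (g + 0.5) * width for g in range(n_ranges)]  # Eq. (1.151)
    return centres, counts, freqs

ph = [4.2, 4.4, 4.0, 4.2, 4.4, 4.6, 4.4, 4.6, 4.4, 5.2, 4.8, 4.5, 4.2]
centres, counts, freqs = histogram(ph)
for centre, count, freq in zip(centres, counts, freqs):
    print(f"centre {centre:.2f}: n = {count}, p = {freq:.3f}")
```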


After drawing up the histogram, the hypothesis on distribution normality of the
obtained variation sequence is checked. The check is done by means of a criterion
for estimating the difference between theoretical and empirical distributions. The
most frequently used is Pearson’s criterion, determined by the formula:

$$\chi_R^2 = \sum_{g=1}^{\ell}\frac{(n_g - N p_g)^2}{N p_g} \tag{1.152}$$

where:

$\ell$ is the number of ranges;

N is the number of values in a random sample;

$n_g$ is the number of values within interval g;

$p_g$ is the probability of values falling within interval g as calculated by the theoretical
distribution.

Pearson’s criterion has $f = \ell - h - 1$ degrees of freedom, where for a normal distribution
h = 2. The probability $p_g$ is calculated by the formula:

$$p_g = F_0(z_{g+1}) - F_0(z_g) \tag{1.153}$$

where:

$z_g$ is the standardized left limit $X_g$ of interval g, or:

$$z_g = \frac{X_g - \bar{X}}{S_x} \tag{1.154}$$

and $F_0(z)$ is Laplace’s function:

$$F_0(z) = \frac{1}{\sqrt{2\pi}}\int_0^z e^{-t^2/2}\,dt \tag{1.155}$$

The values of the normed Laplace’s function are in Table J (appendix). When
determining the given values, $z_g = z_{\min}$ is replaced by $-\infty$ and $z_g = z_{\max}$ by $+\infty$. When
the calculated value of the Pearson criterion is below the tabular value, the null
hypothesis on normal distribution of data is accepted:

$$\chi_R^2 < \chi_{TAB}^2 \tag{1.156}$$

Example  1.40 

In a chemical reactor 200 temperature values have been measured. The largest and
smallest deviations from the average temperature were $X_{\max} = 30$ and $X_{\min} = -20$. The
values have been divided into 10 ranges. The accuracy of the measured temperatures
was ±1 [°C]. By using Pearson’s criterion with α = 0.05 significance level, check the law
on data distribution.

The mean and variance estimates are obtained from expressions:

$$\bar{X} = \sum_{g=1}^{10} X_g^* p_g^* = 4.30\ [°C];\quad S_x^2 = \sum_{g=1}^{10} (X_g^*)^2 p_g^* - \bar{X}^2 = 94.20\ [°C^2];\quad S_x = 9.71\ [°C]$$


Table 1.63 Results of previous calculations

Number of    Limits of     No. of values in    Range center    Relative frequency
ranges       ranges        the range n_g       X*_g            p*_g = n_g/N
1            -20 to -15     7                  -17.5           0.035
2            -15 to -10    11                  -12.5           0.055
3            -10 to -5     15                   -7.5           0.075
4             -5 to 0      24                   -2.5           0.120
5              0 to 5      49                    2.5           0.245
6              5 to 10     41                    7.5           0.205
7             10 to 15     26                   12.5           0.130
8             15 to 20     17                   17.5           0.085
9             20 to 25      7                   22.5           0.035
10            25 to 30      3                   27.5           0.015

From Eq. (1.154) we can calculate the standardized values of the random
value, and then from Table J determine the $F_0(z_g)$ value, keeping in mind that
for $z_g < 0$, $F_0(z_g) = -F_0(|z_g|)$. We can then form Table 1.64 by using (1.153) and (1.152).

Due to the small number of data in range g = 10 we join it to range g = 9. The
number of degrees of freedom is therefore reduced by one. The calculated value of $\chi_R^2$ is
obtained from Eq. (1.152):

$$\chi_R^2 = 7.09;\quad f = \ell - h - 1 = 9 - 2 - 1 = 6$$

From Table D, $\chi_{TAB}^2 = 12.6$ is obtained for f = 6 and α = 0.05, so that:

$$\chi_R^2 = 7.09 < \chi_{TAB}^2 = 12.6$$

Table 1.64 Calculation results


g      z_g       F0(z_g)     p_g        N x p_g     (n_g - N x p_g)^2 / (N x p_g)
1      -∞        -0.5000     0.0239      4.78       1.04
2      -1.99     -0.4761     0.0469      9.38       0.28
3      -1.47     -0.4292     0.0977     19.54       1.05
4      -0.96     -0.3315     0.1615     32.30       2.13
5      -0.44     -0.1700     0.1979     39.58       2.24
6       0.07      0.0279     0.1945     38.90       0.11
7       0.59      0.2224     0.1419     28.38       0.20
8       1.10      0.3643     0.0831     16.62       0.01
9       1.62      0.4474     0.0526     10.52       0.03
10      2.13      0.4834     -          -           -
11     +∞         -          -          -           -

Hence  the  hypothesis  on  normal  distribution  of  measured  temperature  values  in 
the  chemical  reactor  is  accepted. 
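The check in Example 1.40 can be repeated numerically. The sketch below is an illustration, not the book's procedure verbatim: it replaces the look-up in Table J by the error function from Python's standard library, using the identity F0(z) = 0.5·erf(z/√2), and the function names are hypothetical. Ranges 9 and 10 are merged as in the example.

```python
import math

def laplace(z):
    """Normed Laplace function F0(z), i.e. the standard normal CDF minus 0.5."""
    return 0.5 * math.erf(z / math.sqrt(2))

def pearson_chi2(limits, counts, mean, std, n_params=2):
    """Pearson's criterion for normality from grouped data.
    limits holds the range boundaries (one more entry than counts)."""
    n_total = sum(counts)
    z = [(x - mean) / std for x in limits]       # standardized range limits
    z[0], z[-1] = -math.inf, math.inf            # open the two outer ranges
    chi2 = 0.0
    for g, n_g in enumerate(counts):
        p_g = laplace(z[g + 1]) - laplace(z[g])  # theoretical probability of range g
        chi2 += (n_g - n_total * p_g) ** 2 / (n_total * p_g)
    dof = len(counts) - n_params - 1
    return chi2, dof

# Data of Example 1.40 with ranges 9 and 10 merged (7 + 3 = 10 values)
limits = [-20, -15, -10, -5, 0, 5, 10, 15, 20, 30]
counts = [7, 11, 15, 24, 49, 41, 26, 17, 10]

chi2, dof = pearson_chi2(limits, counts, mean=4.30, std=9.71)
print(f"chi2 = {chi2:.2f} with {dof} degrees of freedom (tabular value 12.6)")
```

Because the probabilities are not rounded to four decimals, the statistic comes out close to, but not exactly at, the hand-calculated value; the conclusion is the same.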

Rejection  of  outliers 

Outliers in a random sample often bring about wrong conclusions when testing and
estimating parameters. Such values have to be rejected from a sample with great
caution, for this other extreme may also affect the outcome of conclusions. The following
procedure is suggested in reference [21]:

$$\Delta_{\max} = X_{\max} - \bar{X} \tag{1.157}$$

where:

$X_{\max}$ is the sample outlier.

Check the inequality:

$$|\Delta_{\max}| > C \times S_x \tag{1.158}$$

where:

C is a constant.

The outlier $X_{\max}$ is rejected from the sample if the inequality (1.158) is fulfilled.
The procedure may be repeated several times, whereby the standard deviation $S_x$ is
determined from the remaining sample values each time. The constant C is determined
from Student’s t-criterion by means of the expression:

$$\left[\frac{N C^2 (f + f_0 - 1)}{f(f + f_0) - N C^2}\right]^{0.5} = t_{\alpha;\,f + f_0 - 1} \tag{1.159}$$

where:

f = N − 1 is the number of degrees of freedom of the variance estimate $S_x^2$;

$f_0$ is any number of additional degrees of freedom (usually $f_0 = 0$).

Example 1.41

The  following  values  of  a product  content  have  been  measured  by  gas  chromatogra- 
phy: Xj=23.2;  X2=23.4;  X3=23.5;  X4=24.1;  Xs=25.5.  Is  X5  the  outlier  and  can  it  be 
rejected  from  this  sample  of  values? 

The  calculated  sample  mean  when  X5  is  rejected  is:  X=23.55. 


Amax  = |Xmax— X|  = |25.5  - 23.55|  = 1.95;  N = 5;»  = N- 1 = 5 - 1 = 4 


Sx  = — 


E (X-X)2  E (X-23.55)2 


. >=1 


n—  1 


4-1 


= 0.15;  Sx  = 0.39 


For  the  significance  level  a=0.05;  N=5;  f=4  from  Eq.  (1.159)  we  get: 

0.5 


5 x C x 3 


.4— 5C 


= 2.35 


The  result  of  this  equation  is  C=1.44,  so  that  in  accord  with  Eq.  (1.158)  we  get: 

|Amax|  = 1.95  >-  C x Sx  = 1.44  x 0.39  = 0.56  and  the  hypothesis  that  X5  is  not 
the  outlier  is  rejected. 
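
The arithmetic of Example 1.41 can be retraced with a short script. This is a minimal sketch, not the procedure of reference [21] itself: the function names are hypothetical, and the closed form for C is an assumption obtained by rearranging the numerical equation of the example for the case $f_0 = 0$ and $f = N - 1$.

```python
import math


def outlier_constant(n, t_value):
    """Solve [N*C^2*(f-1) / (f^2 - N*C^2)]^0.5 = t for C, assuming f = N - 1 and f0 = 0."""
    f = n - 1
    return math.sqrt(t_value ** 2 * f ** 2 / (n * (f - 1 + t_value ** 2)))


def check_outlier(values, suspect, t_value):
    """Follow Example 1.41: mean and S are computed with the suspect value left out."""
    rest = list(values)
    rest.remove(suspect)  # drop one occurrence of the suspected value
    mean = sum(rest) / len(rest)
    s = math.sqrt(sum((v - mean) ** 2 for v in rest) / (len(rest) - 1))
    c = outlier_constant(len(values), t_value)
    delta = abs(suspect - mean)
    # Inequality (1.158): reject the value when |delta| >= C * S
    return mean, s, c, delta, delta >= c * s


# Data and tabulated t-value (2.35) are taken from Example 1.41
mean, s, c, delta, rejected = check_outlier([23.2, 23.4, 23.5, 24.1, 25.5], 25.5, 2.35)
print(f"mean={mean:.2f}  S={s:.2f}  C={c:.2f}  |delta|={delta:.2f}  rejected={rejected}")
```

With the data of the example the script gives a mean of 23.55, S of about 0.39 and C of about 1.44, in line with the hand calculation.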


Problem  1.45 

From Table A of random numbers, 150 two-digit numbers have been chosen. The data are given in the next table. Check the normality of the data distribution at the 95% confidence level by using Pearson's criterion.


Table 1.65 Random numbers

Number of    Interval    Number of data    Frequency
intervals    limits      in the range
1            0-9         16                0.107
2            10-19       15                0.100
3            20-29       19                0.127
4            30-39       13                0.087
5            40-49       14                0.093
6            50-59       19                0.127
7            60-69       14                0.093
8            70-79       11                0.073
9            80-89       13                0.087
10           90-99       16                0.107

Problem 1.46

Data on average air temperatures for 320 days are given in the table. Check the normality of the temperature values by using Pearson's criterion at the 5% significance level.


Table 1.66 Air temperatures

Number of    Interval       Number of data
intervals    limits         in the range
1            -40 to -30     5
2            -30 to -20     11
3            -20 to -10     25
4            -10 to 0       42
5            0 to 10        88
6            10 to 20       81
7            20 to 30       36
8            30 to 40       20
9            40 to 50       8
10           50 to 60       4

Problem  1.47 

The acidity of alcohol chlorination products has been measured by a pH meter. The following values have been obtained: 4.2; 4.4; 4.0; 4.2; 4.4; 4.6; 4.4; 4.6; 4.4; 5.2; 4.8; 4.5; 4.2. Is the value pH = 5.2 an outlier?


Problem 1.48

Four batches of the same sort of composite rocket propellant have been cast in a lab. After curing, the experimental rocket motors were static fired at normal temperature. The following burning-rate values at a pressure of P = 70 bar have been obtained: 14.199; 14.531; 14.197; 14.193. Does the burning-rate value V = 14.531 mm/s represent an outlier at the 95% confidence level?


1.6  Regression analysis

Regression is a highly useful statistical technique for developing a quantitative relationship between a dependent variable (response) and one or more independent variables (factors). It uses experimental data on the pertinent variables to develop a numerical relationship showing the influence of the independent variables on a dependent variable of the system.

Throughout engineering, regression may be applied to correlating data in a wide variety of problems, ranging from the simple correlation of physical properties to the analysis of a complex industrial system. For example, in a catalytic reactor involving a complex chemical reaction, regression methods have been used to develop an equation relating the yield of the desired product to entering concentrations, temperature, pressure, and residence time. If nothing is known from theory about the relationship among the pertinent variables, a function may be assumed and fitted to experimental data on the system.

Frequently a linear function is assumed. In other cases, where a linear function does not fit the experimental data properly, the engineer might try a polynomial or exponential function. In a large number of cases theory produces incomplete models. Regression analysis is then used to determine the unknown coefficients in a theoretical equation from available experimental data. The theory of rocket propellant burning, for instance, supposes that the linear burning rate V depends on the pressure p in this way:

$V = b\,p^{n}$  (1.160)

In this case experimental data are used for determining the constants b and n by regression analysis.
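
One common way to estimate b and n is to take logarithms, which turns Eq. (1.160) into the straight line $\ln V = \ln b + n \ln p$, and then to fit that line by least squares. The sketch below illustrates this under stated assumptions: the function name is hypothetical, and the pressure and burning-rate values are synthetic, generated from arbitrarily chosen constants rather than measured.

```python
import math


def fit_power_law(pressures, rates):
    """Fit V = b * p**n by least squares on ln V = ln b + n * ln p."""
    xs = [math.log(p) for p in pressures]
    ys = [math.log(v) for v in rates]
    k = len(xs)
    x_bar = sum(xs) / k
    y_bar = sum(ys) / k
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    n = sxy / sxx                      # slope of the log-log line
    b = math.exp(y_bar - n * x_bar)    # intercept transformed back
    return b, n


# Synthetic, noise-free data generated from b = 3.0 and n = 0.4 (illustrative values only)
pressures = [20.0, 40.0, 70.0, 100.0, 140.0]
rates = [3.0 * p ** 0.4 for p in pressures]
b, n = fit_power_law(pressures, rates)
print(f"b = {b:.4f}, n = {n:.4f}")
```

Fitting in logarithmic coordinates minimizes the squared errors of ln V rather than of V, so with noisy data the result can differ slightly from a direct nonlinear fit.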

1.6.1  Simple Linear Regression

In the simplest case the proposed functional relationship between two variables is:

$Y = \beta_0 + \beta_1 X + \varepsilon$  (1.161)

In this model Y is the dependent variable, X is the independent variable, and $\varepsilon$ is a random error (or residual), which is the amount of variation in Y not accounted for by the linear relationship. The parameters $\beta_0$ and $\beta_1$ are called the regression coefficients; they are unknown and are to be estimated. The variable X is not a random variable but takes on fixed values. It will be assumed that the errors $\varepsilon$ are independent and have a normal distribution with mean 0 and variance $\sigma^2$, regardless of what fixed value of X is being considered. Taking the expectation of both sides of Eq. (1.161), we have:

$E(Y) = \beta_0 + \beta_1 X$  (1.162)

where we note that the expected value of the errors is zero.

For a fixed value of X, the expectation in Eq. (1.162) is usually denoted by:

$E(Y) = E(Y/X) = \mu_{Y/X}$

Thus we can write:

$E(Y) = E(Y/X) = \mu_{Y/X} = \beta_0 + \beta_1 X$  (1.163)

Eq. (1.163) is called the regression of Y on X. The only random variables in Eq. (1.161) are Y and $\varepsilon$. Since $\varepsilon$ is normally distributed, the random variable Y has a normal distribution with mean $\mu_{Y/X} = \beta_0 + \beta_1 X$ and variance $\sigma^2$. A geometric interpretation of the linear regression is shown in Fig. 1.19.


Figure  1.19  Geometric  interpretation  of  linear  regression 


In order to estimate the relationship between Y and X, suppose we have n observations on Y and X, denoted by $(X_1, Y_1), (X_2, Y_2), \ldots, (X_n, Y_n)$. By Eqs. (1.161) and (1.163) we can write the assumed relationship between Y and X as:

$Y = E(Y/X) + \varepsilon$  (1.164)

The objective here is to estimate $\beta_0$ and $\beta_1$, and thus E(Y/X) or Y, in terms of the n observations. If $b_0$ and $b_1$ denote estimates of $\beta_0$ and $\beta_1$, then an estimate of E(Y) is denoted by $\hat{Y} = \hat{E}(Y) = b_0 + b_1 X$.

As mentioned before, one must distinguish between population parameters and sample estimates of population parameters. The equation that describes the population is given by expression (1.162). The population in this case is the basis for a hypothetical physical model that is correctly described by the regression (1.162). We, however, take a data sample, or carry out an experiment on the assumption that the mathematical model is valid, and then use the experimental results or the sample data to calculate $b_0$ and $b_1$ as estimates of the population parameters $\beta_0$ and $\beta_1$. Thus each observed $Y_i$ can be written as:

$Y_i = \hat{Y}_i + e_i$,  i = 1, 2, ..., n  (1.165)

where:

$\hat{Y}_i$ is the estimate of $Y_i$,

$e_i$ is the estimate of the error $\varepsilon_i$.

The linear regression may be written as:

$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i = b_0 + b_1 X_i + e_i$,  i = 1, 2, ..., n  (1.166)

Eq. (1.166) is illustrated in Fig. 1.20.

The point $(X_i, Y_i)$ denotes the i-th observation. The "true" error or residual is $Y_i - (\beta_0 + \beta_1 X_i)$, the difference between the observed $Y_i$ and the true unknown value $\beta_0 + \beta_1 X_i$. The observed residual $e_i$ is $Y_i - (b_0 + b_1 X_i) = Y_i - \hat{Y}_i$, which is the difference between the observed $Y_i$ and the estimated $\hat{Y}_i = b_0 + b_1 X_i$. The problem is now to obtain estimates $b_0$ and $b_1$ from the sample for the unknown parameters $\beta_0$ and $\beta_1$.
This can best be done by the method of least squares, which minimizes the sum of squares of the differences between the predicted values and the experimental values of the dependent variable (response). The method is based on the principle that the best estimators of $\beta_0$ and $\beta_1$ are those that minimize the sum of squares due to error, $SS_E$.


Figure  1.20  True  and  estimated  regression  lines 


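The error sum of squares is:

$SS_E = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$  (1.167)

$SS_E = \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2$  (1.168)

To determine the minimum, the partial derivative of the error sum of squares with respect to each constant is set equal to zero to yield:

$\frac{\partial (SS_E)}{\partial b_0} = \frac{\partial}{\partial b_0} \left[ \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \right] = 0$  (1.169)

$\frac{\partial (SS_E)}{\partial b_1} = \frac{\partial}{\partial b_1} \left[ \sum_{i=1}^{n} (Y_i - b_0 - b_1 X_i)^2 \right] = 0$  (1.170)

These equations (1.169) and (1.170) are called normal equations. Carrying out the differentiation, we obtain:

$n b_0 + b_1 \sum_i X_i = \sum_i Y_i$  (1.171)

$b_0 \sum_i X_i + b_1 \sum_i X_i^2 = \sum_i X_i Y_i$  (1.172)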
where all the summations go from i = 1 to i = n. The solutions to these normal equations are:

$b_0 = \bar{Y} - b_1 \bar{X}$  (1.173)

$b_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}$  (1.174)

The estimator $b_1$ can also be written in the form:

$b_1 = \frac{\sum_i X_i Y_i - n \bar{X} \bar{Y}}{\sum_i X_i^2 - n \bar{X}^2}$  (1.175)

The error sum of squares can be written as:

$SS_E = \sum_i (Y_i - \bar{Y})^2 - b_1^2 \sum_i (X_i - \bar{X})^2$  (1.176)

The first term on the right-hand side of Eq. (1.176) is the total corrected sum of squares, $SS_{TC}$, of the Ys. The linear relationship between X and Y accounts for a reduction of $b_1^2 \sum_i (X_i - \bar{X})^2$ in $SS_{TC}$. That is, if there is no linear relationship between X and Y, $b_1 = 0$, then $\sum_i (Y_i - \bar{Y})^2 = SS_E$. If there is a linear relationship between X and Y, then $SS_{TC}$ is reduced by an amount $b_1^2 \sum_i (X_i - \bar{X})^2$, which is called the sum of squares due to regression and is denoted by $SS_R$. Equation (1.176) can be written as:

$SS_E = SS_{TC} - SS_R$  (1.177)

or:

$SS_T = \sum_i Y_i^2 = n \bar{Y}^2 + SS_E + SS_R = SS_M + SS_E + SS_R$  (1.178)

Thus regression analysis may be looked upon as the process of partitioning the total sum of squares, $SS_T$, into three parts:

(1)  The sum of squares due to the mean, $SS_M$;

(2)  The sum of squares due to error, $SS_E$ (or deviation about the regression line);

(3)  The sum of squares due to regression, $SS_R$.

Another way of stating this result is that each $Y_i$ value is made up of three parts (or partitioned into three segments), each one leading to the corresponding sum of squares. That is,

$Y_i = \bar{Y} + (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i)$,  i = 1, 2, ..., n  (1.179)

Figure 1.21 shows the partition of Y in graphical form.

It should be noted that the estimated regression line always passes through the point $(\bar{X}, \bar{Y})$. This is obvious from:

$\hat{Y}_i = b_0 + b_1 X_i = \bar{Y} - b_1 \bar{X} + b_1 X_i$  (1.180)

which gives $\hat{Y} = \bar{Y}$ for $X_i = \bar{X}$.
Figure 1.21 Partitioning of total sum of squares in simple linear regression

It can be shown that $SS_E/(n-2)$ is an unbiased estimator of $\sigma^2$. Furthermore, $SS_E/\sigma^2$ has a $\chi^2$ distribution with n-2 degrees of freedom.
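
The partition of Eq. (1.178) is easy to verify numerically. The following sketch, with a hypothetical function name, uses the ten (X, Y) pairs that appear in Example 1.42 later in this section and checks that the three parts add up to the total sum of squares.

```python
# Data of Example 1.42 (temperature X, yield Y)
X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [3, 5, 7, 10, 11, 14, 15, 17, 20, 21]


def partition_sums_of_squares(xs, ys):
    """Return SS_T, SS_M, SS_R and SS_E for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    y_hat = [b0 + b1 * x for x in xs]
    ss_t = sum(y * y for y in ys)                           # total, uncorrected
    ss_m = n * y_bar ** 2                                   # due to the mean
    ss_r = sum((yh - y_bar) ** 2 for yh in y_hat)           # due to regression
    ss_e = sum((y - yh) ** 2 for y, yh in zip(ys, y_hat))   # due to error
    return ss_t, ss_m, ss_r, ss_e


ss_t, ss_m, ss_r, ss_e = partition_sums_of_squares(X, Y)
print(f"SS_T={ss_t:.3f}  SS_M={ss_m:.3f}  SS_R={ss_r:.3f}  SS_E={ss_e:.3f}")
print(f"unbiased variance estimate SS_E/(n-2) = {ss_e / (len(X) - 2):.4f}")
```

The variance estimate printed at the end is the quantity $SS_E/(n-2)$ discussed above.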


1.6.1.1  Interval Estimation in Simple Linear Regression

Prior to determining confidence intervals or the test procedures to be used, we recall three assumptions in the model $Y = \beta_0 + \beta_1 X + \varepsilon$:

(1)  The independent variable X is a fixed variable whose values can be observed without error;

(2)  For any given value of X, Y is normally distributed with mean $\mu_{Y/X} = \beta_0 + \beta_1 X$ and variance $\sigma^2_{Y/X}$;

(3)  The variance can be represented as $\sigma^2_{Y/X} = \sigma^2$, which is the same for each X.

The estimator for $\sigma^2$, as previously mentioned, is:

$S^2 = MS_E = \frac{SS_E}{n-2} = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n-2} = \frac{\sum_i Y_i^2 - \sum_i \hat{Y}_i^2}{n-2}$  (1.181)


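For a linear regression one can give the interval estimate of the parameter $\beta_1$ (the slope of the regression line), of $\beta_0$ (its intercept on the Y-axis), of the true mean of Y for any value of X ($\mu_{Y/X} = E(Y)$), and of the true predicted value $Y_i$ corresponding to a fixed value of X. The variances of the estimators of these parameters can be shown to be:

$S_{b_1}^2 = \frac{S^2}{\sum_i (X_i - \bar{X})^2} = \frac{S^2}{\sum_i X_i^2 - (\sum_i X_i)^2 / n}$  (1.182)

$S_{b_0}^2 = \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum_i (X_i - \bar{X})^2} \right] \times S^2 = \left[ \frac{1}{n} + \frac{\bar{X}^2}{\sum_i X_i^2 - (\sum_i X_i)^2 / n} \right] \times S^2$  (1.183)

$S_{\hat{\mu}_{Y/X}}^2 = \left[ \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_i (X_i - \bar{X})^2} \right] \times S^2 = \left[ \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_i X_i^2 - (\sum_i X_i)^2 / n} \right] \times S^2$  (1.184)

$S_{\hat{Y}}^2 = \left[ 1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_i (X_i - \bar{X})^2} \right] \times S^2 = \left[ 1 + \frac{1}{n} + \frac{(X - \bar{X})^2}{\sum_i X_i^2 - (\sum_i X_i)^2 / n} \right] \times S^2$  (1.185)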


For a significance level $\alpha$, the $(1-\alpha)100\%$ confidence interval for the parameter $\beta_1$ is:

$b_1 - S_{b_1} \times t_{n-2;1-\alpha/2} \leq \beta_1 \leq b_1 + S_{b_1} \times t_{n-2;1-\alpha/2}$  (1.186)

For the parameter $\beta_0$:

$b_0 - S_{b_0} \times t_{n-2;1-\alpha/2} \leq \beta_0 \leq b_0 + S_{b_0} \times t_{n-2;1-\alpha/2}$  (1.187)

With a $(1-\alpha) \times 100\%$ confidence level, the confidence interval for $\mu_{Y/X}$ is:

$\hat{Y} - S_{\hat{\mu}_{Y/X}} \times t_{n-2;1-\alpha/2} \leq \mu_{Y/X} \leq \hat{Y} + S_{\hat{\mu}_{Y/X}} \times t_{n-2;1-\alpha/2}$  (1.188)

The confidence interval for any Y value at the corresponding X value and significance level $\alpha$ is:

$\hat{Y} - S_{\hat{Y}} \times t_{n-2;1-\alpha/2} \leq Y \leq \hat{Y} + S_{\hat{Y}} \times t_{n-2;1-\alpha/2}$  (1.189)

It should be noted that the confidence interval for $\mu_{Y/X}$ is narrower than the associated interval for Y, for the latter takes into account the variability of individual Y values. This comes from the fact that:

$S_{\hat{Y}}^2 = S_{\hat{\mu}_{Y/X}}^2 + S^2$  (1.190)

or $S_{\hat{Y}} \geq S_{\hat{\mu}_{Y/X}}$. If, after calculating the confidence interval for any parameter, it turns out that it includes zero, then with $(1-\alpha)100\%$ confidence one can assert that the associated parameter is not significant, i.e. such a parameter is left out of the regression equation.


Example  1.42  [4] 

An intermediate step in a reaction process is A → B. While this reaction is carried out at atmospheric pressure, the temperature varies from 1 to 10 °C. As an initial step in the optimization of this process, the relation between conversion of A and temperature must be obtained. Pilot-plant studies have provided the following data:

X:   1   2   3   4    5    6    7    8    9    10

Y:   3   5   7   10   11   14   15   17   20   21

where:

X is the temperature, and
Y is the yield.

After obtaining the regression model, calculate the confidence intervals for the regression parameters.

Assume that the model $Y = \beta_0 + \beta_1 X + \varepsilon$ describes the experimental data satisfactorily. From the experimental data we calculate:

$\sum X_i = 55.0$;  $\sum Y_i = 123.0$;  $\sum X_i Y_i = 844.0$

$\sum X_i^2 = 385.0$;  $\sum Y_i^2 = 1855.0$;  $(\sum X_i)(\sum Y_i) = 6765.0$

$(\sum X_i)^2 = 3025.0$;  $(\sum Y_i)^2 = 15129.0$;  n = 10

$\bar{X} = \frac{\sum X_i}{n} = 5.5$;  $\bar{Y} = \frac{\sum Y_i}{n} = 12.3$

$\sum (X_i - \bar{X})(Y_i - \bar{Y}) = \sum X_i Y_i - \frac{\sum X_i \sum Y_i}{n} = 167.5$

$\sum (X_i - \bar{X})^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n} = 82.5$

$\sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n} = 342.1$

$b_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = 2.0303$;  $b_0 = \bar{Y} - b_1 \bar{X} = 1.1333$

$\hat{Y} = 1.1333 + 2.0303 \times X$

$SS_E = \sum_i (Y_i - \bar{Y})^2 - b_1^2 \sum_i (X_i - \bar{X})^2 = 2.025$

$S^2 = \frac{SS_E}{n-2} = 0.2531$;  $S = 0.5031$

a)  95% confidence interval for $\beta_0$ according to Eqs. (1.183) and (1.187):

$S_{b_0} = \left( \frac{1}{10} + \frac{5.5^2}{82.5} \right)^{1/2} \times 0.5031 = 0.3437$

$1.1333 - t_{8;0.975} \times S_{b_0} \leq \beta_0 \leq 1.1333 + t_{8;0.975} \times S_{b_0}$

$1.1333 - 2.306 \times 0.3437 \leq \beta_0 \leq 1.1333 + 2.306 \times 0.3437$

$0.3407 \leq \beta_0 \leq 1.9259$

b)  95% confidence interval for $\beta_1$ according to Eqs. (1.182) and (1.186):

$S_{b_1} = \frac{0.5031}{\sqrt{82.5}} = 0.05539$

$2.0303 - 2.306 \times 0.05539 \leq \beta_1 \leq 2.0303 + 2.306 \times 0.05539$

$1.9026 \leq \beta_1 \leq 2.1580$

c)  95% confidence interval for $\mu_{Y/X}$ when X = 4 according to Eqs. (1.184) and (1.188):

$S_{\hat{\mu}_{Y/X}} = \left[ \frac{1}{10} + \frac{(4 - 5.5)^2}{82.5} \right]^{1/2} \times 0.5031 = 0.1795$

$\hat{Y} = 1.1333 + 2.0303 \times 4 = 9.2545$

$9.2545 - 2.306 \times 0.1795 \leq \mu_{Y/X} \leq 9.2545 + 2.306 \times 0.1795$

$8.8406 \leq \mu_{Y/X} \leq 9.6684$

d)  95% confidence interval for Y when X = 4 according to Eqs. (1.185) and (1.189):

$S_{\hat{Y}} = (1 + 0.1273)^{1/2} \times 0.5031 = 0.5341$

$9.2545 - 2.306 \times 0.5341 \leq Y \leq 9.2545 + 2.306 \times 0.5341$

$8.0229 \leq Y \leq 10.4861$
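
The four intervals of Example 1.42 can be recomputed with the sketch below. It applies Eqs. (1.182) to (1.189) directly; the function name and the layout of the returned dictionary are hypothetical, and the value t = 2.306 is the tabulated one quoted in the example rather than being computed.

```python
import math


def simple_regression_intervals(xs, ys, t_value, x_new):
    """Estimates and confidence intervals for a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    s = math.sqrt((syy - b1 ** 2 * sxx) / (n - 2))        # Eqs. (1.176), (1.181)
    s_b1 = s / math.sqrt(sxx)                             # Eq. (1.182)
    s_b0 = s * math.sqrt(1 / n + x_bar ** 2 / sxx)        # Eq. (1.183)
    dist = (x_new - x_bar) ** 2 / sxx
    s_mean = s * math.sqrt(1 / n + dist)                  # Eq. (1.184)
    s_pred = s * math.sqrt(1 + 1 / n + dist)              # Eq. (1.185)
    y_new = b0 + b1 * x_new

    def interval(center, se):
        return center - t_value * se, center + t_value * se

    return {
        "b0": b0, "b1": b1, "S": s,
        "ci_b0": interval(b0, s_b0),
        "ci_b1": interval(b1, s_b1),
        "ci_mean": interval(y_new, s_mean),
        "ci_pred": interval(y_new, s_pred),
    }


X = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Y = [3, 5, 7, 10, 11, 14, 15, 17, 20, 21]
result = simple_regression_intervals(X, Y, t_value=2.306, x_new=4)
for name, value in result.items():
    print(name, value)
```

Because the script carries full precision through every step, the last digits can differ slightly from the hand calculation, which rounds intermediate results.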


Example  1.43  [22] 

Monthly consumption of steam was measured in a production plant. Simultaneously the monthly average temperature of the atmosphere was recorded. The obtained pairs of values are given in the table:

Table 1.67 Steam consumption

Number of data          1      2      3      4      5      6      7      8      9     10     11     12     13
T [°F]               35.3   29.7   30.8   58.8   61.4   71.3   74.4   76.7   70.7   57.5   46.4   28.9   28.1
Steam consumption   10.98  11.13  12.51   8.40   9.27   8.73   6.36   8.50   7.82   9.14   8.24  12.19  11.88

Number of data         14     15     16     17     18     19     20     21     22     23     24     25
T [°F]               39.1   46.8   48.5   59.3   70.0   70.0   74.5   72.1   58.1   44.6   33.4   28.6
Steam consumption    9.57  10.94   9.58  10.09   8.11   6.83   8.88   7.68   8.47   8.86  10.36  11.08

Determine the coefficients of the linear regression and the 95% confidence intervals for all parameters.

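$\sum X_i = 1315$;  $\sum Y_i = 235.60$;  $\bar{X} = 52.60$;  $\bar{Y} = 9.424$;

$\sum X_i Y_i = 11821.4320$;  $\sum X_i^2 = 76323.42$;  n = 25;  $\sum Y_i^2 = 2284.1102$;

$b_1 = \frac{\sum X_i Y_i - \sum X_i \sum Y_i / n}{\sum X_i^2 - (\sum X_i)^2 / n} = \frac{11821.4320 - 1315 \times 235.60 / 25}{76323.42 - 1315^2 / 25} = -0.079829$

$b_0 = \bar{Y} - b_1 \times \bar{X} = 9.424 - (-0.079829 \times 52.60) = 13.623005$

$\hat{Y} = 13.623005 - 0.079829 \times X$

$\sum (Y_i - \bar{Y})^2 = \sum Y_i^2 - \frac{(\sum Y_i)^2}{n} = 2284.1102 - \frac{235.60^2}{25} = 63.82$

$\sum (X_i - \bar{X})^2 = \sum X_i^2 - \frac{(\sum X_i)^2}{n} = 76323.42 - \frac{1315^2}{25} = 7154.42$

$SS_E = \sum (Y_i - \bar{Y})^2 - b_1^2 \sum (X_i - \bar{X})^2 = 63.82 - (-0.079829)^2 \times 7154.42 = 18.23$

$S^2 = \frac{SS_E}{n-2} = \frac{18.23}{23} = 0.7926$;  $S = 0.8903$


a)  95% confidence interval for $\beta_1$:

$S_{b_1} = \frac{0.8903}{\sqrt{7154.42}} = 0.0105$

$-0.079829 - t_{23;0.975} \times S_{b_1} \leq \beta_1 \leq -0.079829 + t_{23;0.975} \times S_{b_1}$

$-0.079829 - 2.07 \times 0.0105 \leq \beta_1 \leq -0.079829 + 2.07 \times 0.0105$

$-0.1015 \leq \beta_1 \leq -0.0581$


b)  95% confidence interval for $\beta_0$:

$S_{b_0} = \left( \frac{1}{25} + \frac{52.60^2}{7154.42} \right)^{1/2} \times 0.8903 = 0.5816$

$13.623005 - 2.07 \times 0.5816 \leq \beta_0 \leq 13.623005 + 2.07 \times 0.5816$

$12.4191 \leq \beta_0 \leq 14.8269$


c)  95% confidence interval for the response mean $\mu_{Y/X}$ when $X = \bar{X}$:

$S_{\hat{\mu}_{Y/X}} = \left( \frac{1}{25} + 0.0 \right)^{1/2} \times 0.8903 = 0.1781$

$\hat{Y} = \bar{Y} = 9.424$;  $\bar{X} = 52.60$;

$9.424 - 2.07 \times 0.1781 \leq \mu_{Y/X=\bar{X}} \leq 9.424 + 2.07 \times 0.1781$

$9.0553 \leq \mu_{Y/X=\bar{X}} \leq 9.7927$


The standard deviation for estimating the mean Y has a minimal value for $X = \bar{X}$, and it rises when X takes values on either side of $\bar{X}$. In other words, the farther X is from the mean $\bar{X}$, the greater the error that may be expected in estimating the mean Y for the given X value. This becomes obvious if we take the following X value:

X = 28.60;  $\hat{Y} = 13.623005 - 0.079829 \times 28.60 = 11.3399$;  $S_{\hat{\mu}_{Y/X}} = 0.3091$;  $10.7001 \leq \mu_{Y/X=28.60} \leq 11.9797$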

If we continue this calculation for other X values, we obtain the geometric interpretation shown in Figure 1.22.

Both the figure and the previous calculation show that the width of the 95% confidence interval around the regression line changes depending on the value X takes. The limit lines of the 95% confidence interval are hyperbolas.
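
The widening of the band away from $\bar{X}$ can be tabulated from the summary statistics of Example 1.43 alone. The sketch below is an illustration under stated assumptions: the constant and function names are hypothetical, the numerical inputs are the rounded values quoted in the example, and the list of X values is chosen arbitrarily within the measured temperature range.

```python
import math

# Summary statistics quoted in Example 1.43
N = 25
X_BAR = 52.60
SXX = 7154.42
S = 0.8903
B0, B1 = 13.623005, -0.079829
T_VALUE = 2.07  # tabulated t for 23 degrees of freedom, as used in the example


def mean_response_band(x):
    """Lower limit, upper limit and standard deviation of the estimated mean Y at x."""
    s_mean = S * math.sqrt(1 / N + (x - X_BAR) ** 2 / SXX)   # Eq. (1.184)
    y_hat = B0 + B1 * x
    half_width = T_VALUE * s_mean
    return y_hat - half_width, y_hat + half_width, s_mean


for x in (28.6, 40.0, X_BAR, 65.0, 76.7):
    lower, upper, s_mean = mean_response_band(x)
    print(f"X={x:5.1f}  S_mean={s_mean:.4f}  {lower:.4f} <= mu <= {upper:.4f}")
```

The printed standard deviation is smallest at $X = \bar{X}$ and grows symmetrically on both sides, which is what gives the limit lines their hyperbolic shape.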
Figure 1.22 Geometric presentation of 95% confidence interval


1.6.1.2  Analysis of Variance of Simple Linear Regression

We have previously introduced the mean square due to error, $MS_E = SS_E/(n-2)$, and said that it is an unbiased estimate of the error variance $\sigma^2$, because $E(MS_E) = \sigma^2$ no matter whether the null hypothesis $H_0: \beta_1 = 0$ is correct or not. It is easy to prove that the expected value of the regression mean square, $MS_R = SS_R/1$, is a biased estimate of the variance $\sigma^2$ unless $\beta_1 = 0$. This can be written in the form:

$E(MS_R) = \sigma^2 + \beta_1^2 \sum_i (X_i - \bar{X})^2 \geq \sigma^2$  (1.191)

The two above-mentioned expected values of the mean squares suggest the F-test for testing the null hypothesis $H_0: \beta_1 = 0$:

$F = \frac{MS_R}{MS_E}$  (1.192)

This ratio of the mean squares due to regression and error has an F-distribution with $f_1 = 1$ and $f_2 = n-2$ degrees of freedom if the null hypothesis $H_0: \beta_1 = 0$ is correct. The null hypothesis is rejected if the obtained value is above the tabular one, $F > F_{1;n-2;1-\alpha}$. Since $MS_R$ and $MS_E$ are both estimates of $\sigma^2$ when $H_0: \beta_1 = 0$ holds, while $E(MS_R) > \sigma^2$ when it does not, the null hypothesis is rejected at the significance level $\alpha$ if the F-ratio is considerably above 1.

Analysis of variance for the linear regression is given in the usual way in Table 1.68.

Apart from this form, the analysis of variance is also frequently used in which the total sum of squares $SS_T$ is corrected for the sum of squares due to the mean, $SS_M$. Such a sum of squares is called the total corrected sum of squares, $SS_{TC} = SS_T - SS_M$. Analysis of variance of the linear regression with the total corrected sum of squares is given in Table 1.69.
Table 1.68 AOV of linear regression

Source of variation      f     SS                                                  MS                     E(MS)                                              Test statistic
Average ($\beta_0$)      1     $n\bar{Y}^2 = SS_M$                                 $MS_M = SS_M$          -                                                  -
Regression ($\beta_1$)   1     $b_1 \sum_i (X_i - \bar{X})(Y_i - \bar{Y}) = SS_R$  $MS_R = SS_R$          $\sigma^2 + \beta_1^2 \sum_i (X_i - \bar{X})^2$    $MS_R / MS_E$
Error or residual        n-2   $\sum_i (Y_i - \hat{Y}_i)^2 = SS_E$                 $MS_E = SS_E/(n-2)$    $\sigma^2$                                         -
Total                    n     $\sum_i Y_i^2 = SS_T$                               -                      -                                                  -

Table 1.69 AOV of linear regression

Source of variation      f     SS                                                              MS                     Test statistic
Regression ($\beta_1$)   1     $SS_R = b_1 \left[ \sum X_i Y_i - \sum X_i \sum Y_i / n \right]$    $MS_R = SS_R$          $MS_R / MS_E$
Error or residual        n-2   $SS_E = SS_{TC} - SS_R$                                         $MS_E = SS_E/(n-2)$    -
Total corrected sum      n-1   $SS_{TC} = \sum Y_i^2 - (\sum Y_i)^2 / n$                       -                      -

Unlike the previous one, this analysis of variance table uses different expressions for the sums of squares, which are more suitable for calculations.

Example  1.44 

Do the analysis of variance for the linear regression obtained in Example 1.42.

The calculated value of the test statistic for the null hypothesis $H_0: \beta_1 = 0$ is F = 1343.6. Since this is far above the tabular value $F_{1;8;0.95} = 5.32$, the null hypothesis is rejected and the alternative hypothesis is accepted: the regression coefficient $\beta_1$ is statistically significant at the 95% confidence level.
Table 1.70 Analysis of variance

Source of variation     f     SS          MS          E(MS)                            Test statistic
Average $\beta_0$       1     1512.900    1512.900    -                                -
Regression $\beta_1$    1     340.075     340.075     $\sigma^2 + 82.5 \beta_1^2$      340.075/0.2531 = 1343.6
Residual                8     2.025       0.2531      $\sigma^2$                       -
Total                   10    1855.000    -           -                                -

Example  1.45 

Do the analysis of variance for the linear regression obtained from the experimental values in Example 1.43.


Table 1.71 Analysis of variance

Source of variation     f     SS       MS        F        $F_T$
Regression $\beta_1$    1     45.59    45.59     57.52    $F_{1;23;0.95} = 4.28$
Residual                23    18.23    0.7926    -        -
Corrected total         24    63.82    -         -        -

Since the calculated value is $F = 57.52 > F_{TAB} = 4.28$, it can be asserted at the 95% confidence level that the regression coefficient $\beta_1$ is significantly different from zero and that it should be kept in the linear regression.
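
The F-ratios of Examples 1.44 and 1.45 follow directly from the corrected sums of squares and products. The sketch below recomputes them; the function name is hypothetical, the inputs are the sums quoted in Examples 1.42 and 1.43, and the tabular F-values are the ones given in the text rather than computed.

```python
def regression_anova(n, sxx, sxy, syy):
    """Corrected-total analysis of variance for a simple linear regression."""
    b1 = sxy / sxx
    ss_r = b1 * sxy            # sum of squares due to regression, 1 degree of freedom
    ss_e = syy - ss_r          # residual sum of squares, n - 2 degrees of freedom
    ms_e = ss_e / (n - 2)
    return {"SSR": ss_r, "SSE": ss_e, "MSE": ms_e, "F": ss_r / ms_e}


# Example 1.42: corrected sums quoted in the worked example
ex_142 = regression_anova(10, 82.5, 167.5, 342.1)

# Example 1.43: corrected sums built from the raw sums quoted in the example
n = 25
sxx = 76323.42 - 1315 ** 2 / n
sxy = 11821.4320 - 1315 * 235.60 / n
syy = 2284.1102 - 235.60 ** 2 / n
ex_143 = regression_anova(n, sxx, sxy, syy)

for label, res, f_tab in (("1.42", ex_142, 5.32), ("1.43", ex_143, 4.28)):
    print(f"Example {label}: F = {res['F']:.2f}, tabular F = {f_tab}, "
          f"significant = {res['F'] > f_tab}")
```

Small differences from the tabulated F-values arise because the tables round the sums of squares before dividing.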

1.6.1.3  Lack of Fit of the Simple Linear Regression 4)

The test for lack of fit of the regression model breaks up the residual sum of squares into a sum of squares for lack of fit and an experimental error sum of squares. This can be done only if, for some values of X, we have more than one value of Y. This concept of separating the experimental error sum of squares (or variance) from the residual sum of squares (or variance) is known from the chapter on analysis of variance. It should be noted that replicating a trial, in this case and in general, does not mean keeping the X value constant and reading the response Y several times. It means setting the same X value anew at different times and taking a single reading of Y each time, with trials at other X values done in between. Suppose that of the n Xs there are k distinct Xs, where k < n, which occur with frequencies $n_1, n_2, \ldots, n_k$, where $n_1 + n_2 + \ldots + n_k = n$. The sum of squares of the $n_i$ Ys corresponding to an $X_i$:

$\sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_i)^2 = \sum_{j=1}^{n_i} Y_{ij}^2 - n_i \bar{Y}_i^2 = (SS_E)_i$  (1.193)


4)  Adequacy of linear regression

is a sum of squares due to experimental error. The pooled sum of squares due to experimental error is denoted by:

$SS_E = \sum_{i=1}^{k} (SS_E)_i$  (1.194)

with the following degrees of freedom:

$\sum_{i=1}^{k} (n_i - 1) = n - k$  (1.195)

The mean square due to experimental error is:

$MS_E = \frac{SS_E}{n - k}$  (1.196)

The usual residual sum of squares is:

$SS_{RES} = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$

with (n-2) degrees of freedom. The lack-of-fit sum of squares is:

$SS_{LF} = SS_{RES} - SS_E$  (1.197)

with n-2-(n-k) = k-2 degrees of freedom, and the lack-of-fit mean square is:

$MS_{LF} = \frac{SS_{LF}}{k - 2}$  (1.198)

The critical region in testing $H_0$: no lack of fit is $F > F_{k-2;n-k;1-\alpha}$, where:

$F = \frac{MS_{LF}}{MS_E}$  (1.199)

has an F-distribution with k-2 and n-k degrees of freedom.

If the obtained value of the F-ratio is below the tabular value $F_{k-2;n-k;1-\alpha}$, the null hypothesis that the linear regression is adequate is accepted at the $(1-\alpha)100\%$ confidence level. Hence the analysis of variance of a linear regression should also include a check of lack of fit. If the F-ratio for lack of fit is statistically:

• Significant, it shows that the obtained linear regression does not adequately describe the experimental results. The next step is establishing the reason for lack of fit.

• Insignificant, there is no reason to doubt the adequacy of the obtained regression model, and both the experimental error and lack-of-fit mean squares can be used as estimates of the variance $\sigma^2$.

Some  of  the  standard  situations  in  doing  regression  analysis  are  given  in 
Fig.  1.23. 
Example 1

• Model: $Y = \beta_0 + \beta_1 X + \varepsilon$

• No lack of fit

• Significant linear regression

• Use model: $\hat{Y} = b_0 + b_1 X$


Example 2

• Model: $Y = \beta_0 + \beta_1 X + \varepsilon$

• No lack of fit

• Linear regression not significant

• Use model: $\hat{Y} = \bar{Y}$


Example 3

• Model: $Y = \beta_0 + \beta_1 X + \varepsilon$

• Significant lack of fit

• Significant linear regression

• Use model: $\hat{Y} = b_0 + b_1 X + b_{11} X^2$


Example 4

• Model: $Y = \beta_0 + \beta_1 X + \varepsilon$

• Significant lack of fit

• Linear regression not significant

• Use model: $\hat{Y} = b_0 + b_1 X + b_{11} X^2$


Figure 1.23 Typical linear regression situations

Example 1.46 [22]

To demonstrate the analysis of variance for a linear regression model, the following experimental results were used:

Table 1.72 Experimental results

No    X      Y        No    X      Y        No    X      Y
1     1.3    2.3      9     3.7    1.7      17    5.3    3.5
2     1.3    1.8      10    4.0    2.8      18    5.3    2.8
3     2.0    2.8      11    4.0    2.8      19    5.3    2.1
4     2.0    1.5      12    4.0    2.2      20    5.7    3.4
5     2.7    2.2      13    4.7    5.4      21    6.0    3.2
6     3.3    3.8      14    4.7    3.2      22    6.0    3.0
7     3.3    1.8      15    4.7    1.9      23    6.3    3.0
8     3.7    3.7      16    5.0    1.8      24    6.7    5.9

From the given experimental data the following linear regression is obtained: $\hat{Y} = 1.436 + 0.338X$. Do the analysis of variance and determine the lack of fit of the regression model.

The analysis of variance without splitting the residual variance into experimental error and lack-of-fit variance is given in the table:


Table 1.73 Analysis of variance

Source of variation    f     SS        MS       F        $F_T$
Regression             1     6.326     6.326    6.569    $F_{1;22;0.95} = 4.30$
Residual               22    21.192    0.963    -        -
Corrected total        23    27.518    -        -        -

The experimental error variance, the so-called pure error, is calculated as follows.

$SS_E$ for X = 1.3 is:

$(SS_E)_{X=1.3} = \sum_{j=1}^{2} Y_j^2 - n_{X=1.3} \times \bar{Y}^2 = 2.3^2 + 1.8^2 - 2 \left( \frac{2.3 + 1.8}{2} \right)^2 = 0.5 (2.3 - 1.8)^2 = 0.125$

with degrees of freedom $f = n_i - 1 = 2 - 1 = 1$;

$SS_E$ for X = 4.7 is:

$(SS_E)_{X=4.7} = \sum_{j=1}^{3} Y_j^2 - n_{X=4.7} \times \bar{Y}^2 = 5.4^2 + 3.2^2 + 1.9^2 - 3 \left( \frac{5.4 + 3.2 + 1.9}{3} \right)^2 = 6.26$

with degrees of freedom $f = n_i - 1 = 3 - 1 = 2$.

After continuing such calculations for the other replicated trials we obtain:

Table 1.74 Analysis of variance

Value X    $\sum_j (Y_{ij} - \bar{Y}_i)^2$    Degrees of freedom f
1.3        0.125                              1
2.0        0.845                              1
3.3        2.000                              1
3.7        2.000                              1
4.0        0.240                              2
4.7        6.260                              2
5.3        0.980                              2
6.0        0.020                              1
Total      12.470                             11

Now that the sum of squares of experimental error, $SS_E = 12.470$ with f = 11 degrees of freedom, has been calculated, we can give the analysis of variance table with the lack-of-fit variance:


Table 1.75 Analysis of variance

Source of variation    f     SS        MS       F                        $F_T$
Regression             1     6.326     6.326    6.326/0.963 = 6.569      $F_{1;22;0.95} = 4.30$
Residual               22    21.192    0.963    -                        -
Corrected total        23    27.518    -        -                        -
Lack of fit            11    8.722     0.793    0.793/1.134 = 0.699      $F_{11;11;0.95} = 2.82$
Pure error             11    12.470    1.134    -                        -

Since the calculated value $F = 0.699 < F_{11;11;0.95} = 2.82$, there is no reason to doubt the adequacy of the linear regression.
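
The lack-of-fit calculation of Example 1.46 can be retraced from the data of Table 1.72. The sketch below groups the responses by repeated X value to obtain the pure error, as in Eqs. (1.193) to (1.199); the function name and the returned dictionary keys are hypothetical.

```python
from collections import defaultdict

# Data of Table 1.72
X = [1.3, 1.3, 2.0, 2.0, 2.7, 3.3, 3.3, 3.7, 3.7, 4.0, 4.0, 4.0,
     4.7, 4.7, 4.7, 5.0, 5.3, 5.3, 5.3, 5.7, 6.0, 6.0, 6.3, 6.7]
Y = [2.3, 1.8, 2.8, 1.5, 2.2, 3.8, 1.8, 3.7, 1.7, 2.8, 2.8, 2.2,
     5.4, 3.2, 1.9, 1.8, 3.5, 2.8, 2.1, 3.4, 3.2, 3.0, 3.0, 5.9]


def lack_of_fit(xs, ys):
    """Split the residual sum of squares into pure error and lack of fit."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b1 = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
          / sum((x - x_bar) ** 2 for x in xs))
    b0 = y_bar - b1 * x_bar
    ss_res = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))

    groups = defaultdict(list)          # responses collected per distinct X value
    for x, y in zip(xs, ys):
        groups[x].append(y)
    k = len(groups)
    ss_pe = 0.0
    for values in groups.values():
        g_mean = sum(values) / len(values)
        ss_pe += sum((v - g_mean) ** 2 for v in values)   # Eq. (1.193), pooled as in (1.194)

    df_pe, df_lf = n - k, k - 2
    ss_lf = ss_res - ss_pe                                # Eq. (1.197)
    f_ratio = (ss_lf / df_lf) / (ss_pe / df_pe)           # Eq. (1.199)
    return {"b0": b0, "b1": b1, "SS_res": ss_res, "SS_pe": ss_pe,
            "SS_lf": ss_lf, "df_pe": df_pe, "df_lf": df_lf, "F": f_ratio}


result = lack_of_fit(X, Y)
print(result)
print("adequate at the 95% level:", result["F"] < 2.82)  # tabular value quoted in the example
```

Single trials contribute nothing to the pure error, so only the eight replicated X values enter the pooled sum, exactly as in Table 1.74.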

1.6.2  Multiple Regression

Multiple  regression  can  be  used  to  develop  a quantitative  equation  relating  a depen- 
dent variable  with  several  independent  variables.  In  multiple  linear  regression,  any 
number  of  independent  variables  may  be  considered: 

y — Pq  T + fijXi  4-  ...  + |3^Xp  d-  £ (1.200) 

Assumptions  of  a multiple  regression  analysis  are  identical  to  those  for  linear 
regression  except  for  the  p independent  variables  in  this  case.  To  reach  regression 
coefficient  estimates  b;  by  the  method  of  least  squares,  we  again  have  to  minimize 
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the  sum  of  squares  due  to  errors.  Similarly,  as  in  linear  regression,  we  have  n data 
for  Y and  X1  ,X2  ,...Xp  , therefore  sum  of  squares  of  errors: 

ssE  = E 4 = E (yi  - Yi)2=  E (y;  -bo-  \xu  - b2x2i  - ...  - fcpxpi)  (1.201) 

i i i 

The  sum  of  squared  errors  between  the  observed  value  and  predicted  value  is 
minimized  by  taking  the  partial  derivative  (SSE)/b;  with  respect  to  each  parameter 
and  setting  each  result  equal  to  zero: 

nb0  + NE xu  + b2  E X2i  + + bp  E xpi  = E Y; 

K Exu  + b\  E^ii  + b2  J2xiixn  + •••  + bpJ2xi ixPi  = ExuYi 

bo  E X2i  +^lE  XliX2i  + ^2  E X2i  + •••  + bp  E X2iXpi  = E X2i  Yi 
bo  E xpi  ff’iE  xi;xpi  + ^2  E X2ixpi  + — + bp  E = E Xpi  E 

Here the sums are taken over all i = 1, ..., n data points. To obtain the regression
coefficient estimates b0, b1, ..., bp it is necessary to solve this system of simultaneous
linear equations. The simplest way to do this is to use matrix algebra on a digital
computer. In the case of p independent variables the sum of squares due to error is:

$$SS_E = \sum_i (Y_i - \hat{Y}_i)^2$$

where:

$$\hat{Y}_i = b_0 + b_1 X_{1i} + \dots + b_p X_{pi}$$

It is easy to prove:

$$SS_E = \sum_i (Y_i - \hat{Y}_i)^2 = \sum_i (Y_i - \bar{Y})^2 - \left[ b_1 \sum_i (X_{1i} - \bar{X}_1)(Y_i - \bar{Y}) + \dots + b_p \sum_i (X_{pi} - \bar{X}_p)(Y_i - \bar{Y}) \right] \tag{1.202}$$

where:

$$\bar{X}_j = \sum_{i=1}^{n} \frac{X_{ji}}{n}$$

The sum in the square brackets is the sum of squares due to regression SSR,
so that:

$$SS_E = SS_{TC} - SS_R; \qquad SS_{TC} = SS_E + SS_R$$

where:

$$SS_R = b_1 \sum_i (X_{1i} - \bar{X}_1)(Y_i - \bar{Y}) + \dots + b_p \sum_i (X_{pi} - \bar{X}_p)(Y_i - \bar{Y}) \tag{1.203}$$

It is of particular importance to test the null hypothesis:

H0:  β1 = β2 = ... = βp = 0

This null hypothesis is tested by the analysis of variance of multiple regression
shown in Table 1.76. It should also be noted that the unbiased estimate of σ² is given
by:

$$MS_E = \frac{\sum_i (Y_i - \hat{Y}_i)^2}{n - p - 1}$$

Table 1.76  Analysis of variance of multiple regression

Source of variation   f        SS      MS                    Test statistic
Regression            p        SSR     MSR = SSR/p           F = MSR/MSE
Error-residual        n-p-1    SSE     MSE = SSE/(n-p-1)     -
Total                 n-1      SSTC    -                     -

The test statistic is the ratio F = MSR/MSE, which is compared with the tabular value
Fp;(n-p-1);1-α. If F > Fp;(n-p-1);1-α, the null hypothesis is rejected and the alternative
one is accepted.
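
The normal equations of this section can be written compactly in matrix notation, which
is the form usually handed to a computer. The block below is a sketch in standard
notation; the matrix symbols X, Y and b are introduced here for illustration and are
not used elsewhere in the text. Row i of X holds a leading 1 followed by the values
of the p independent variables in observation i.

```latex
\mathbf{X} =
\begin{bmatrix}
1 & X_{11} & X_{21} & \cdots & X_{p1} \\
1 & X_{12} & X_{22} & \cdots & X_{p2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & X_{1n} & X_{2n} & \cdots & X_{pn}
\end{bmatrix},
\qquad
\mathbf{Y} = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix},
\qquad
\mathbf{b} = \begin{bmatrix} b_0 \\ b_1 \\ \vdots \\ b_p \end{bmatrix}

% Normal equations and their formal solution
(\mathbf{X}^{T}\mathbf{X})\,\mathbf{b} = \mathbf{X}^{T}\mathbf{Y}
\quad\Longrightarrow\quad
\mathbf{b} = (\mathbf{X}^{T}\mathbf{X})^{-1}\,\mathbf{X}^{T}\mathbf{Y}
```

Writing out the product X^T X element by element reproduces the sums of squares and
cross products that appear as coefficients in the normal equations.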

Example  1.47  [4] 

It is necessary to relate the per cent gas absorbed in a tower, Y, to the gas temperature
X1 and the vapor pressure of the absorbing liquid, X2. The following data are
available:


Table 1.77  Experimental values

Y        X1       X2
6.0      113.5    3.2
10.0     130.0    4.8
20.0     154.0    8.4
30.0     169.0    12.0
50.0     187.0    18.5
80.0     206.0    27.5
100.0    214.0    32.0

The postulated model is Y = β0 + β1X1 + β2X2 + ε. Determine the regression coefficients
and do the analysis of variance of the multiple regression. From the tabulated data the
following values are calculated:

ΣX1i = 1251.5;  ΣX2i = 107.4;  ΣX1iX2i = 20359.0;
ΣX1i² = 211344.25;  ΣX2i² = 2371.339;  ΣX1i·ΣX2i = 134411.0;
(ΣX1i)² = 1566252.25;  (ΣX2i)² = 11534.75;  ΣYi = 297.5;
ΣX1iYi = 57478.0;  ΣX2iYi = 6921.70;  ΣYi² = 20338.25;
ΣX1i·ΣYi = 372321.25;  ΣX2i·ΣYi = 31951.49;
X̄1 = 156.4375;  X̄2 = 13.4249;  Ȳ = 37.1875
Σ(X1i - X̄1)² = 15562.75;  Σ(X2i - X̄2)² = 929.4951;
Σ(X1i - X̄1)(X2i - X̄2) = 3557.914
Σ(X1i - X̄1)(Yi - Ȳ) = 10937.84;  Σ(X2i - X̄2)(Yi - Ȳ) = 2927.762;
Σ(Yi - Ȳ)² = 9274.968

The regression coefficients are obtained by solving the system of normal equations:

$$\begin{cases}
8 b_0 + 1251.5\, b_1 + 107.4\, b_2 = 297.5 \\
1251.5\, b_0 + 211344.25\, b_1 + 20359.0\, b_2 = 57478.0 \\
107.4\, b_0 + 20359.0\, b_1 + 2371.339\, b_2 = 6921.70
\end{cases}$$

b1 = -0.13840;  b2 = 3.6796;  b0 = 9.4398

The linear regression has the form:

$$\hat{Y} = 9.4398 - 0.1384\, X_1 + 3.6796\, X_2$$

To analyze the variance we calculate the sum of squares due to regression:

$$SS_R = b_1 \sum_i (X_{1i} - \bar{X}_1)(Y_i - \bar{Y}) + b_2 \sum_i (X_{2i} - \bar{X}_2)(Y_i - \bar{Y}) = -0.1384 \times 10937.84 + 3.6796 \times 2927.762 = 9259.196$$

and the sum of squares due to error:

$$SS_E = \sum_i (Y_i - \hat{Y}_i)^2 = \sum_i (Y_i - \bar{Y})^2 - SS_R = 9274.968 - 9259.196 = 15.772$$


Table 1.78  Analysis of variance

Source of variation   f    SS          MS          F           Ft
Regression            2    9259.196    4629.598    1467.850    F2;5;0.95 = 5.79
Residual-error        5    15.772      3.154       -           -
Total                 7    9274.968    -           -           -

Based on the analysis of variance, at the 95% confidence level the null hypothesis
H0: β1 = β2 = 0 is rejected, since the calculated value F = 4629.598/3.154 = 1467.850 is
considerably above the tabular value F2;5;0.95 = 5.79.
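
The bookkeeping of the analysis of variance table can be sketched in a few lines of
Python. This is only an illustration: the function name `regression_anova` and the
dictionary layout are assumptions made here, and the inputs are the sums of squares
printed in the example, not values recomputed from the raw data.

```python
def regression_anova(ss_regression, ss_total, n, p):
    """ANOVA quantities for a multiple regression with p regressors and n runs."""
    ss_error = ss_total - ss_regression          # SSE = SSTC - SSR
    df_regression = p
    df_error = n - p - 1
    ms_regression = ss_regression / df_regression
    ms_error = ss_error / df_error               # unbiased estimate of sigma^2
    return {
        "df": (df_regression, df_error, n - 1),
        "SS": (ss_regression, ss_error, ss_total),
        "MS": (ms_regression, ms_error),
        "F": ms_regression / ms_error,
    }


# Sums of squares as printed in Example 1.47 (n = 8 observations, p = 2 regressors)
table = regression_anova(ss_regression=9259.196, ss_total=9274.968, n=8, p=2)

F_TABULAR = 5.79  # F2;5;0.95, taken from the text
reject_h0 = table["F"] > F_TABULAR

print(table)
print("reject H0:", reject_h0)
```

The F ratio comes out slightly different from the printed 1467.850 in the last digits
because the table rounds MSE to 3.154 before dividing; the conclusion is unaffected.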
1.6.3

Polynomial  Regression 

In the case of polynomial or curvilinear regression, given by the model:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \dots + \beta_p X^p + \varepsilon \tag{1.204}$$

there is only one independent variable, X. However, p-1 other independent variables
are defined as powers of X. This p-degree polynomial regression may be
reduced to multiple linear regression by introducing the substitutions W1 = X;
W2 = X²; ...; Wp = X^p. The regression coefficients are then determined by the procedure
given in the previous section, applied to the multiple linear regression model of
Eq. (1.200) with independent variables W1, W2, ..., Wp.

As an example consider the quadratic model:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon \tag{1.205}$$

In this case W1 = X and W2 = X², and the normal equations, given n observations
on X and Y, are:

$$\begin{cases}
n b_0 + b_1 \sum_i W_{1i} + b_2 \sum_i W_{2i} = \sum_i Y_i \\
b_0 \sum_i W_{1i} + b_1 \sum_i W_{1i}^2 + b_2 \sum_i W_{1i} W_{2i} = \sum_i W_{1i} Y_i \\
b_0 \sum_i W_{2i} + b_1 \sum_i W_{1i} W_{2i} + b_2 \sum_i W_{2i}^2 = \sum_i W_{2i} Y_i
\end{cases} \tag{1.206}$$

However,

$$W_{1i} = X_i; \qquad W_{2i} = X_i^2$$

Thus the normal equations become:

$$\begin{cases}
n b_0 + b_1 \sum_i X_i + b_2 \sum_i X_i^2 = \sum_i Y_i \\
b_0 \sum_i X_i + b_1 \sum_i X_i^2 + b_2 \sum_i X_i^3 = \sum_i X_i Y_i \\
b_0 \sum_i X_i^2 + b_1 \sum_i X_i^3 + b_2 \sum_i X_i^4 = \sum_i X_i^2 Y_i
\end{cases} \tag{1.207}$$

Equations (1.207) can be solved for b0, b1 and b2. Extensions to polynomials of
higher degree are obvious and the solution follows in the same manner.

It should be pointed out that when one speaks of a linear model in regression, the
term linear means linear in the parameters β0, β1, ..., βp and not in the independent
variable X. Other examples of linear models (linear in the parameters) are:

$$Y = \beta_0 + \beta_1 \log X + \beta_2 X^2 + \varepsilon$$

$$Y = \beta_0 + \beta_1 e^{-X} + \beta_2 X^{1/2} + \varepsilon$$

$$Y = \beta_0 + \beta_1 e^{-X_1} + \beta_2 X_2 + \beta_3 X_3 + \varepsilon$$

These regressions are linear in their regression coefficients but not in
their independent variables. Nonlinearity in the independent variables is easily
reduced to linearity through these substitutions:

W1 = log X;  W2 = X²  =>  Y = β0 + β1W1 + β2W2 + ε

W1 = e^(-X);  W2 = X^(1/2)  =>  Y = β0 + β1W1 + β2W2 + ε

W1 = e^(-X1);  W2 = X2;  W3 = X3  =>  Y = β0 + β1W1 + β2W2 + β3W3 + ε

All regression equations that are linear in their regression coefficients are analyzed
by the methods developed so far. If an equation is not linear in its coefficients, we
deal with nonlinear regression, the analysis of which is considerably more complicated
and requires iterative procedures.

For instance, Van Laar's equation for the activity coefficient of a component in a
binary mixture at vapor-liquid equilibrium is nonlinear in the coefficients A and B:

$$\log \gamma_1 = \frac{A}{\left[ 1 + \dfrac{A x_1}{B (1 - x_1)} \right]^2} \tag{1.208}$$

Suppose we have a set of experimental data for two variables and that the data may be
described by linear regressions. Which of the candidate linear regressions adequately
describes the experimental data is established by checking each one for lack of
fit. A large number of linear regression forms are at our disposal; some of them
are shown in Fig. 1.24.


Figure 1.24  Useful forms for empirical equations

Tabular values of the regression coefficients for the regressions shown in Fig. 1.24
are given in Table 1.79.
Table 1.79  Regression coefficients of linear regressions

Regression   General form              b0    b1     b2       Curve
A            Y = b0 + b1X              1     1      -        A1
                                       10    -1     -        A2
B            Y = b0 + b1X + b2X²       1     1      0.05     B1
                                       1     1      -0.05    B2
                                       10    -1     0.05     B3
                                       10    -1     -0.05    B4
C            Y = b0 + b1/X             1     10     -        C1
                                       20    -10    -        C2

One  can  notice  in  Fig.  1.24  that  the  form  of  some  linear  regressions,  especially 
group  B,  is  very  sensitive  to  a change  in  regression  coefficient  value. 

Example  1.48  [4] 

It is believed that the effect of temperature on catalyst activity is quadratic. The
proposed model is:

$$Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$$

Eight different temperatures (coded X data below) were used. The resulting activities
are given as Y. Determine the polynomial regression coefficients.

Y:  0.846;  0.573;  0.401;  0.288;  0.209;  0.153;  0.111;  0.078;

X:  2;  4;  6;  8;  10;  12;  14;  16;

If we let W1 = X and W2 = X², the model reduces to the form

$$Y = \beta_0 + \beta_1 W_1 + \beta_2 W_2 + \varepsilon$$

Following the same procedure as for multiple linear regression, the values of the
regression coefficients are:

b0 = 1.05652;  b1 = -0.13114;  b2 = 0.00447

The resulting regression equation is then:

$$\hat{Y} = 1.05652 - 0.13114\, X + 0.00447\, X^2$$
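
The calculation can be reproduced by building the normal equations (1.207) from the
data and solving them. The sketch below uses only the Python standard library; the
helper names `solve_linear_system` and `polynomial_fit` are illustrative assumptions,
and Gaussian elimination is just one convenient way of solving the small system.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Bring the row with the largest pivot into position
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def polynomial_fit(xs, ys, degree):
    """Least-squares coefficients b0..bp of a degree-p polynomial."""
    k = degree + 1
    # Normal matrix entry (j, l) is sum(x**(j + l)); right-hand side is sum(x**j * y)
    a = [[sum(x ** (j + l) for x in xs) for l in range(k)] for j in range(k)]
    b = [sum((x ** j) * y for x, y in zip(xs, ys)) for j in range(k)]
    return solve_linear_system(a, b)


# Data of Example 1.48
X = [2, 4, 6, 8, 10, 12, 14, 16]
Y = [0.846, 0.573, 0.401, 0.288, 0.209, 0.153, 0.111, 0.078]

b0, b1, b2 = polynomial_fit(X, Y, degree=2)
print(round(b0, 5), round(b1, 5), round(b2, 5))
```

For polynomials of high degree the normal matrix becomes ill-conditioned, so in
practice one would usually centre and scale X first or use an orthogonal
decomposition; for a quadratic in eight points the direct solution is adequate.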


Example  1.49  [23] 

We will develop an equation from the data on the heat capacity of benzene vapor as
a function of temperature. The experimental values are:

Cp, cal/(gmol K):  19.65;  26.74;  32.80;  37.74;  41.75;  45.06;  47.83;  50.16;
T, K:              300;  400;  500;  600;  700;  800;  900;  1000.

The scatter diagram shows that the plotted data are clearly not linear, so a three-term
polynomial will be used, and three further models are fitted for comparison:

A:  Cp = b0 + b1T + b2T²
B:  Cp = b0 + b1T + b2/T
C:  Cp = b0 + b1T
D:  Cp = b1T

Calculate the linear regression coefficients for all given regression models and
present them graphically. For the regression Cp = β0 + β1T + β2T², the method of least
squares leads to the following system of normal equations:

$$\begin{cases}
n b_0 + b_1 \sum T + b_2 \sum T^2 = \sum C_p \\
b_0 \sum T + b_1 \sum T^2 + b_2 \sum T^3 = \sum C_p T \\
b_0 \sum T^2 + b_1 \sum T^3 + b_2 \sum T^4 = \sum C_p T^2
\end{cases}$$

From the data given:

n = 8
ΣCp = 19.65 + ... + 50.16 = 301.73
ΣT = 300 + ... + 1000 = 5200
ΣT² = 300² + ... + 1000² = 3.8×10⁶
ΣT³ = 300³ + ... + 1000³ = 3.016×10⁹
ΣT⁴ = 300⁴ + ... + 1000⁴ = 2.5316×10¹²
ΣCpT = 19.65×300 + ... + 50.16×1000 = 214115
ΣCpT² = 19.65×300² + ... + 50.16×1000² = 166.0315×10⁶

Substituting these values in the normal equations we get the regression coefficients:

b0 = -13.212;  b1 = 0.12395;  b2 = -6.24×10⁻⁵

so that the regression becomes:

$$C_p = -13.212 + 0.12395\, T - 6.2400 \times 10^{-5}\, T^2$$

Calculations for all plotted regressions are in the next table.
Table 1.80  Calculated and experimental values

Temperature   Heat capacity    Calculated data for regressions
[K]           [cal/gmol K]     A        B        C        D
300           19.65            18.36    18.08    22.72    16.90
400           26.74            26.38    27.53    27.01    22.54
500           32.80            33.16    33.66    31.29    28.17
600           37.74            38.70    38.25    35.57    33.81
700           41.75            42.98    41.80    39.86    39.44
800           45.06            46.01    44.82    44.14    45.08
900           47.83            47.80    47.44    48.42    50.71
1000          50.16            48.34    49.89    52.71    56.35

A graphical comparison of all four linear regressions with the data is given in Fig. 1.25.


Figure 1.25  Comparison of the fit between the data and the plotted
regression equations (ordinate: heat capacity Cp, cal/gmol K)


1.6.4 

Nonlinear  Regression 

We often meet mathematical models in engineering practice that are not linear
either in their regression coefficients or in their independent variables. Nonlinearity
in the independent variables was treated under polynomial regression in the previous
section. Nonlinearity in the regression coefficients, however, is a harder problem;
it is nowadays solved by iterative procedures on fast digital computers. The procedure
for determining regression coefficients in nonlinear regression is given in
reference [22].
A nonlinear model that occurs quite frequently is:

$$Y = \beta_0 e^{\beta_1 X} \tag{1.209}$$

This model is usually handled by taking the natural logarithm of both sides of
the equation, yielding:

$$\ln Y = \ln \beta_0 + \beta_1 X \tag{1.210}$$

Letting Z = ln Y, a0 = ln β0 and a1 = β1, the model reduces to the linear model:

$$Z = a_0 + a_1 X \tag{1.211}$$

The method of least squares is now applied to determine the regression coefficients
a0 and a1.

The following nonlinear model is also met in practice:

$$Y = \beta_0 \beta_1^{X} \tag{1.212}$$

This nonlinear model becomes linear when logarithms and substitutions are
introduced:

$$\log Y = \log \beta_0 + X \log \beta_1 \tag{1.213}$$

Substitutions:

Z = log Y;  a0 = log β0;  a1 = log β1  =>  Z = a0 + a1X

One should be careful in using transformations such as the above: if it is
assumed that the original variable is normally distributed, then the transformed
variable may not be. The homogeneity of variance property may likewise be violated.
Frequently, however, the original assumption of normality is not justified and
the transformed variables have a distribution closer to normal.
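
The caution about transformations can be made concrete for model (1.209). The sketch
below is illustrative only: the text does not state how the random error ε enters the
model, so the multiplicative and additive forms shown are both assumptions introduced
here for comparison.

```latex
% Multiplicative error: the logarithm turns it into an additive error,
% so ordinary least squares on Z = ln Y is appropriate.
Y = \beta_0 e^{\beta_1 X} e^{\varepsilon}
\;\Longrightarrow\;
\ln Y = \ln\beta_0 + \beta_1 X + \varepsilon

% Additive error: after the transformation the error term depends on X,
% so its variance is no longer constant.
Y = \beta_0 e^{\beta_1 X} + \varepsilon
\;\Longrightarrow\;
\ln Y = \ln\beta_0 + \beta_1 X
      + \ln\!\left( 1 + \frac{\varepsilon}{\beta_0 e^{\beta_1 X}} \right)
```

In the second case the least-squares estimates obtained from the transformed model
minimize the squared errors of ln Y, not of Y, and generally differ from the estimates
an iterative nonlinear fit of the original model would give.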

Example  1.50  [4] 

The vapor pressure of water adsorbed on silica gel can be expressed as a function of
the vapor pressure of pure water for various gel loadings in spacecraft humidity-
water-recovering systems. For the water loading of 0.1 lb water/lb dry silica gel, the
following data were obtained:

p, adsorbed H2O (Y):  0.038;  0.080;  0.174;  0.448;  1.43;  5.13;  9.47;

p, pure H2O (X):      0.2;  0.4;  0.8;  2.0;  6.0;  20.0;  35.0

A plot of the p data on log-log paper yields a straight line, which suggests an equation
of the form:

$$Y = \beta_0 X^{\beta_1}$$

By taking logarithms and introducing the following substitutions we get the linear
regression model:

$$\log Y = \log \beta_0 + \beta_1 \log X$$

Substitutions: Z = log Y, a0 = log β0, W = log X, so that:

$$Z = a_0 + \beta_1 W$$

The following quantities are then calculated:

Z̄ = -0.254785;  W̄ = 0.390065;  ΣZ = -1.783492;  ΣW = 2.730458;
ΣZ² = 5.400247;  ΣW² = 5.429265;  ΣZW = 3.950120;  (ΣW)² = 7.455401;
(ΣZ)² = 3.180843;  ΣZ·ΣW = -4.869750.

The coefficients are obtained by solving the normal equations:

$$Z = -0.670018 + 1.06452\, W$$

Converting back into the original variables we get:

$$Y = 0.21379\, X^{1.06452}$$
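
The linearization used in this example can be sketched in Python with the standard
library alone. The function name `fit_power_law` is an assumption made for
illustration; the fit is an ordinary straight-line least-squares fit of log Y on
log X, which is not the same as an iterative nonlinear fit of the original model.

```python
import math


def fit_power_law(xs, ys):
    """Fit Y = b0 * X**b1 by least squares on the log10-transformed data."""
    w = [math.log10(x) for x in xs]   # W = log X
    z = [math.log10(y) for y in ys]   # Z = log Y
    n = len(w)
    w_mean = sum(w) / n
    z_mean = sum(z) / n
    s_wz = sum((wi - w_mean) * (zi - z_mean) for wi, zi in zip(w, z))
    s_ww = sum((wi - w_mean) ** 2 for wi in w)
    slope = s_wz / s_ww                     # exponent b1
    intercept = z_mean - slope * w_mean     # a0 = log b0
    return 10 ** intercept, slope


# Data of Example 1.50: vapor pressure over the gel (Y) versus pure water (X)
p_pure = [0.2, 0.4, 0.8, 2.0, 6.0, 20.0, 35.0]
p_gel = [0.038, 0.080, 0.174, 0.448, 1.43, 5.13, 9.47]

b0, b1 = fit_power_law(p_pure, p_gel)
print(round(b0, 5), round(b1, 5))
```

The base of the logarithm does not affect the fitted b0 and b1, so the same function
applies to any model of the form Y = b0 X^b1, whichever logarithm the hand calculation
uses.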

Example  1.51 

In a pilot plant for producing composite rocket propellants a batch of propellant was
produced with the idea of characterizing it by measuring the linear burning rate at
different pressures in a Crawford bomb. The following values were obtained
experimentally:

V, mm/s:  11.71;  13.63;  17.25;  18.92;

P, bar:   50.6;  75.3;  116.3;  138.8.

Based on theoretical and empirical knowledge, the relationship between burning
rate and pressure has the form:

$$V = b P^{n}$$

Determine the coefficients in the given nonlinear regression model.

Taking logarithms gives log V = log b + n log P, a linear regression of log V on
log P. The values obtained by least squares are:

b = 1.744

n = 0.482.


1.7 

Correlation  Analysis 

Having determined that a relationship exists between variables, the next question
that arises is how closely the variables are associated. The strongest and closest
relationship between variables is the functional relationship, i.e. the relationship
in which each value of the independent variable corresponds to exactly one value of
the dependent variable. A weaker relationship between variables, subject to smaller or
greater deviations, is called a correlational or stochastic relationship.

The statistical techniques that have been developed to measure the amount of
association between variables are called correlation methods. A statistical analysis
performed to determine the degree of correlation is called a correlation analysis. For
example, the area of a circle and its radius are functionally connected, while the
burning rate and pressure of a propellant show a correlational, stochastic
relationship. The measure of correlation is referred to as the correlation
coefficient. The correlation coefficient measures how well the regression equation
fits the experimental data. As such, it is closely related to the standard error of
estimate.

It has been mentioned before that the first indication of the form of the relationship
between variables is given by plotting the experimental values in a coordinate
system. Such a graph is called a scatter diagram. The distribution of points in
the scatter diagram shows the direction and form of the relationship and, up to
a point, its strength. Fig. 1.26 shows cases of stronger positive and weaker negative
linear correlation as well as cases of no correlation.


Figure 1.26  Cases of different correlations


To measure the strength of the linear relationship between X and Y we use relation
(1.179):

$$Y_i - \bar{Y} = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) \tag{1.215}$$

If there were a complete, functional relationship between X and Y, all experimental
values Yi would equal the regression values Ŷi, and all the data points in the scatter
diagram would fall on the regression line. There would then be no deviations of the
experimental values of the dependent variable from the regression; that is, the second
term on the right-hand side of Eq. (1.215) would equal zero. Similarly, from the
partition of the sum of squares, Eq. (1.177), SSTC = SSR + SSE, we would get SSE = 0,
a perfect description of the experimental data by the regression equation. The other
extreme case is when there is no linear relationship between the variables. In that
case all the regression values Ŷi are equal to the arithmetic mean Ȳ of the dependent
variable, the first term on the right-hand side of Eq. (1.215) equals zero, and
SSR = 0. In accord with this explanation, the coefficient of determination r² is
defined as:

$$r^2 = \frac{SS_R}{SS_{TC}} = \frac{SS_{TC} - SS_E}{SS_{TC}} = 1 - \frac{SS_E}{SS_{TC}} \tag{1.216}$$

The coefficient of determination is that proportion of the total variability in the
dependent variable that is accounted for by the regression equation in the independent
variable(s). A value for r² of 1 indicates that the fitted regression equation
accounts for all the variability of the values of the dependent variable in the sample
data. At the other extreme, a value of 0 for r² indicates that the regression equation
accounts for none of the variability. In other cases it has values between zero and
one, approaching one as the linear relationship becomes stronger and zero as it becomes
weaker. The square root of the coefficient of determination is called the
correlation coefficient r.

In line with what was said about the coefficient of determination, the correlation
coefficient is itself a measure of the strength of the relationship and takes values
between -1 and +1. When the absolute value of the correlation coefficient is close to
one, the linear relationship between the variables is strong, and when it is close to
zero there is no linear relationship between the variables. This, however, does not
mean that there is no relationship at all; there might even be a strong relationship
of some curved shape. We point out that the correlation coefficient is a dimensionless
number, i.e. it does not depend on the units in which the variables are expressed.
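
The remark that a correlation coefficient near zero does not rule out a strong curved
relationship is easy to check numerically. The sketch below computes r from its
definition as the sum of cross products divided by the root of the product of the sums
of squares; the function name and the two small data sets are hypothetical and chosen
only for illustration.

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r of paired observations."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy / math.sqrt(s_xx * s_yy)


x = [-3, -2, -1, 0, 1, 2, 3]
y_line = [2 * xi + 1 for xi in x]      # exact linear (functional) relationship
y_parabola = [xi ** 2 for xi in x]     # exact functional relationship, but curved

r_line = correlation_coefficient(x, y_line)
r_parabola = correlation_coefficient(x, y_parabola)
print(r_line, r_parabola)
```

Both data sets are exact functional relationships, yet only the straight line gives
r = 1; for the symmetric parabola the positive and negative cross products cancel and
r = 0, which is why a scatter diagram should always accompany the coefficient.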

The following is accepted as an empirical rule:

• a correlation coefficient up to 0.30 indicates a weak relationship and is of
uncertain validity;

• a correlation coefficient between 0.50 and 0.70 indicates a significant relationship
and is of practical importance;

• a correlation coefficient above 0.90 means a strong relationship.

In statistical studies it is often more convenient to determine the correlation
coefficient first and the regression equations afterwards.


1.7.1 

Correlation  in  Linear  Regression 

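For the simple linear regression model, Y = β0 + β1X + ε, the sum of squares due to
regression is:

$$SS_R = b_1^2 \sum_i (X_i - \bar{X})^2$$

Thus for the simple linear model we have:

$$r^2 = \frac{SS_R}{SS_{TC}} = \frac{b_1^2 \sum_i (X_i - \bar{X})^2}{\sum_i (Y_i - \bar{Y})^2} = \frac{\left[ \sum_i (X_i - \bar{X})(Y_i - \bar{Y}) \right]^2}{\sum_i (X_i - \bar{X})^2 \sum_i (Y_i - \bar{Y})^2} \tag{1.217}$$

since:

$$b_1 = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_i (X_i - \bar{X})^2}$$

we have:

$$r = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{\left[ \sum_i (X_i - \bar{X})^2 \sum_i (Y_i - \bar{Y})^2 \right]^{1/2}} \tag{1.218}$$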
Example 1.52

Referring to the data of Example 1.42 and using Eq. (1.218) we calculate the simple
linear correlation coefficient as:

$$r = \frac{167.5}{(82.5 \times 342.1)^{1/2}} = \sqrt{0.994} = 0.997$$

so that r² = 0.994, indicating that the regression equation accounts for 99.4% of the
variability of the data. Since Σ(Xi - X̄)(Yi - Ȳ) = +167.5 is positive, the positive
square root is taken. X and Y are therefore positively correlated: as X increases or
decreases, the corresponding values of Y increase or decrease accordingly, and the
slope of the regression line is positive. In this example the value of the correlation
coefficient is quite high, r = 0.997, indicating a "strong" linear relationship.

The remaining 0.6% of the variability is not explained by the regression model. It is
the consequence of not taking into account all the factors affecting the response,
of not choosing the right form of regression model, and of measurement errors.

The correlation coefficient may also be expressed through the sample covariance SXY:

$$S_{XY} = \frac{\sum_i (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} \tag{1.219}$$

so that:

$$r = \frac{S_{XY}}{S_X S_Y} \tag{1.220}$$

where SX and SY are the sample standard deviations of X and Y.

Coefficient  of  determination-regression  statistical  significance-lack  of  fit  of  regression 

It has already been mentioned that the coefficient of determination is that proportion
of the total variability in the dependent variable that is accounted for by the
regression equation in the independent variable(s). A value for r² of 1 indicates that
the fitted regression equation accounts for all the variability of the values of the
dependent variable in the sample data. At the other extreme, a value of 0 for r²
indicates that the regression equation accounts for none of the variability.

It cannot, however, be concluded that a high value of the coefficient of
determination automatically means that the regression is statistically significant.
In fact, one can obtain a value of 1 for r² by simply fitting a regression equation
that includes as many (statistically estimable) terms as there are observations (i.e.,
data points). When the number of observations exceeds the number of terms in the
regression equation by only a small number, the coefficient of determination
might be large even if there is no true relationship between the independent and
dependent variables. For example, the chances are one in ten of obtaining a value of
r² as high as 0.9756 in fitting a simple linear regression equation to the relationship
between an independent variable X and a normally distributed dependent variable Y
based on only 3 observations, even if X is totally unrelated to Y; i.e., this result
can occur 10% of the time even if the two variables are unrelated. On the other hand,
with 100 observations a coefficient of determination of 0.07 is sufficient to establish
statistical significance of a linear regression at the 1% level. More generally,
Table 1.81 indicates the value of r² required to establish statistical significance
for a simple linear regression equation.

Table 1.81  Values of r² for a simple regression

Sample    Statistical significance level
size      α = 0.1    α = 0.05    α = 0.01
3         0.9756     0.9938      0.9998
4         0.810      0.9030      0.9800
5         0.65       0.77        0.92
6         0.53       0.66        0.84
7         0.45       0.57        0.77
8         0.39       0.50        0.70
9         0.34       0.44        0.64
10        0.30       0.40        0.59
11        0.27       0.36        0.54
12        0.25       0.33        0.50
13        0.23       0.31        0.47
14        0.21       0.28        0.44
15        0.19       0.26        0.41
20        0.14       0.20        0.31
25        0.11       0.16        0.26
30        0.09       0.13        0.22
40        0.07       0.10        0.16
50        0.05       0.08        0.13
100       0.03       0.04        0.07

Note that Table 1.81 applies only to a simple linear regression equation. For the
case of multiple regression, statistical significance of the overall regression equation
can be determined by the F-ratio in the analysis of variance [22]. Practical
significance and statistical significance are not equivalent. With a small sample, it
is possible not to obtain any evidence of a statistically significant regression
relationship between two variables even if their true relationship is quite strong.
This is because, as seen above, a relatively high value of r² is required to show a
regression equation to be statistically significant when only a small number of
observations are used. On the other hand, a regression equation based on only a modest
(and practically unimportant) true relationship may be established as statistically
significant if a sufficiently large number of observations are available. For example,
it was seen that with 100 observations a value of r² = 0.07 was sufficient to establish
a highly significant statistical linear relationship between two variables.
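
The critical values of r² for a simple linear regression can be related to the tabular
F value: with 1 and n-2 degrees of freedom, F = MSR/MSE = r²(n-2)/(1-r²), so the
smallest significant r² is F/(F + n - 2). The sketch below applies this relation; the
function names are assumptions made here, the closed form used for n = 3 is the
standard quantile of Student's t with one degree of freedom, and the value F = 6.90
for 1 and 98 degrees of freedom is an approximate tabular value quoted from memory,
not from the text.

```python
import math


def r2_critical_from_f(f_tabular, n):
    """Smallest r^2 that is significant in a simple linear regression on n points."""
    return f_tabular / (f_tabular + n - 2)


def f_tabular_1_and_1_df(alpha):
    """F(1; 1; 1-alpha), the square of the two-sided t quantile with 1 d.f.

    For one degree of freedom the t distribution has the closed-form
    quantile tan(pi * (1 - alpha) / 2).
    """
    return math.tan(math.pi * (1 - alpha) / 2) ** 2


# First row of the table: n = 3 observations, so the error has 1 degree of freedom
row_n3 = [r2_critical_from_f(f_tabular_1_and_1_df(a), 3) for a in (0.10, 0.05, 0.01)]
print([round(v, 4) for v in row_n3])

# Last row, alpha = 0.01: n = 100, using an approximate tabular F for (1, 98) d.f.
r2_n100 = r2_critical_from_f(6.90, 100)
print(round(r2_n100, 2))
```

For other sample sizes the same function can be used with F values read from a table
of the F distribution for 1 and n-2 degrees of freedom.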

Furthermore, the magnitude of r² depends directly on the range of variation of the
independent variables for the given data. The coefficient of determination
decreases with a decrease in the range of variation of the independent variables,
assuming the correct regression model is being fitted to the data. For example,
Fig. 1.27 shows the fitted regression equation between an independent variable, X, and
a dependent variable, Y, based on 110 equally spaced values of X over the range from
10 to 20.

The estimated coefficient of determination is r² = 0.89. However, if one had available
only the 30 observations in the range 14 to 16 (see Fig. 1.28), the resulting
coefficient of determination from the fitted regression equation would be only
r² = 0.21.

Thus a large value of r² might reflect the fact that the data had been obtained over
an unrealistically large range of variation. Conversely, a small value of r² might be
due to the limited range of the independent variables. This is sometimes the case in
analyzing data from a manufacturing process in which normal plant practice
restricts the range of the process variables. Note also that a large and statistically
significant coefficient of determination does not assure that the chosen regression
model adequately represents the true relationship for all purposes. A coefficient of
determination of r² = 0.99, even if statistically significant, for a regression model
involving only linear terms in each of the independent variables, does not mean that a
model that also includes quadratic and interaction terms could not conceivably yield
a significantly better fit, nor that the "real cause variables" have been included in
the regression equation.


Figure 1.27  Data plots with r² = 0.89


Figure 1.28  Data plots with r² = 0.21
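
The dependence of r² on the range of the independent variable can be reproduced with
a small synthetic data set. The data below are an assumption made purely for
illustration and are not the data of the figures: Y equals X plus a scatter that
alternates between +1 and -1, so that the scatter is the same everywhere and only the
range of X changes.

```python
def r_squared(xs, ys):
    """Coefficient of determination of a simple linear regression of Y on X."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    s_yy = sum((y - y_mean) ** 2 for y in ys)
    return s_xy ** 2 / (s_xx * s_yy)


# Synthetic data: 110 equally spaced X values from 10 to 20, Y = X + (+1 or -1)
xs = [10 + 10 * i / 109 for i in range(110)]
ys = [x + (1 if i % 2 == 0 else -1) for i, x in enumerate(xs)]

r2_full = r_squared(xs, ys)

# Keep only the observations with X between 14 and 16
narrow = [(x, y) for x, y in zip(xs, ys) if 14 <= x <= 16]
r2_narrow = r_squared([x for x, _ in narrow], [y for _, y in narrow])

print(round(r2_full, 2), round(r2_narrow, 2))
```

With the full range the coefficient of determination is close to 0.9, while the
restricted range gives a value near 0.2, although the underlying relationship and the
size of the scatter are identical in both cases.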
1.7.2

Correlation in Multiple Linear Regression


In multiple linear regression, where the model is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \varepsilon \tag{1.221}$$

the coefficient of determination is, according to Eq. (1.203):

$$r^2 = \frac{SS_R}{SS_{TC}} = \frac{b_1 \sum_i (X_{1i} - \bar{X}_1)(Y_i - \bar{Y}) + \dots + b_p \sum_i (X_{pi} - \bar{X}_p)(Y_i - \bar{Y})}{\sum_i (Y_i - \bar{Y})^2} \tag{1.222}$$

Eq. (1.222) is analogous to (1.217). The coefficient r² as defined by Eq. (1.222) is
called the multiple coefficient of determination, and its square root r the multiple
correlation coefficient.


Example  1.53 

The multiple coefficient of determination may be obtained for the data of Example 1.47
as a measure of the "goodness of fit" of the regression equation already
estimated. Eq. (1.222) gives:

$$r^2 = \frac{b_1 \sum_i (X_{1i} - \bar{X}_1)(Y_i - \bar{Y}) + b_2 \sum_i (X_{2i} - \bar{X}_2)(Y_i - \bar{Y})}{\sum_i (Y_i - \bar{Y})^2}$$

therefore,

r² = (-0.12748 × 10937.84 + 3.60271 × 2959.17)/9274.97 = 0.9991;  r = 0.9995.

Such a high correlation coefficient indicates that the regression model describes
the experimental data extremely well. Apart from the multiple correlation
coefficient, the following partial coefficients of determination were calculated:

r²(X1X2) = 0.8712;  r²(X1Y) = 0.8288;  r²(X2Y) = 0.9956

Comparing the partial coefficient of determination r²(X2Y) = 0.9956 with the multiple
coefficient of determination r²(X1X2Y) = 0.9991 shows that very little was gained by
adding X1 to the correlation.


Problem  1.49  [22] 

A study was made on the effect of temperature on the yield of a chemical
process. The following data (in coded form) were collected:


X:  -5;  -4;  -3;  -2;  -1;  0;  1;  2;  3;  4;  5;

Y:  1;  5;  4;  7;  10;  8;  9;  13;  14;  13;  18;


Determine:


• Assuming the model Y = β0 + β1X + ε, what are the least squares estimates
of the regression coefficients?

• Do the analysis of variance for the significance level α = 0.05.

• What are the confidence limits for β1 (α = 0.05)?

• What are the confidence limits (1-α = 95%) for the true mean value of
Y when X = 3?

• What are the confidence limits (1-α = 95%) for the true mean value of
Y when X = 3 and X = -2?


Problem  1.50  [22] 

Thirteen specimens of 90/10 Cu-Ni alloys, each with a specific iron
content, were tested in a corrosion-wheel setup. The wheel was
rotated in salt sea water at 30 ft/sec for 60 days. The corrosion was
measured as weight loss in milligrams/square decimeter/day, MDD.
The following data were collected:


X, % Fe:         0.01;  0.48;  0.71;  0.95;  1.19;  0.01;  0.48;
Y, loss in MDD:  127.6;  124.0;  110.8;  103.9;  101.5;  130.1;  122.0;

X, % Fe:         1.44;  0.71;  1.96;  0.01;  1.44;  1.96;
Y, loss in MDD:  92.3;  113.1;  83.7;  128.0;  91.4;  86.2.

Determine the coefficients in the linear regression model and do the analysis
of variance, including a check of the lack of fit of the
obtained regression model.


Problem  1.51  [24] 

Two colorimetric methods were compared by measuring the content
of a chemical component. Based on the experimental results, determine
whether a linear regression dependence exists between the methods.


Method  I:  3720;  4328;  4655;  4818;  5545;  7278;  7880;  10085;  11707; 

Method  II:  5363;  6195;  6428;  6662;  7562;  9184;  10070;  12519;  13980. 


Problem  1.52  [24] 

Prepared sapphire samples were mechanically tested as a function of
temperature. Find the linear regression dependence between the
measured Young's modulus and temperature.


X, °C:  30;  100;  200;  300;  400;  500;  600;  700;  800;

Y:  4642;  4612;  4565;  4513;  4476;  4433;  4389;  4347;  4303;

X, °C:  900;  1000;  1100;  1200;  1300;  1400;  1500;

Y:  4251;  4201;  4140;  4100;  4073;  4024;  3999.

Problem  1.53  [10] 

Two  procedures  were  tested  in  developing  a method  for  measuring 
blood  flow.  Based  on  obtained  results  determine  whether  there 
exists  linear  correlation  between  the  procedures,  and  if  there  is,  give 
the  linear  regression  analysis  of  variance. 


X:  1190;  1455;  1550;  1730;  1745;  1770;  1900;  1920;  1960;

Y:  1115;  1425;  1515;  1795;  1715;  1710;  1830;  1920;  1970;

X:  2295;  2335;  2490;  2720;  2710;  2530;  2900;  2760;  3010.

Y:  2300;  2280;  2520;  2630;  2740;  2390;  2800;  2630;  2970.


Problem 1.54

Moisture content in the mixture of a product influences the
density of the final product. The moisture of the analyzed mixture
was controlled and the density of the final product measured.
The experimental values are:


X, %:  4.7;  5.0;  5.2;  5.2;  5.9;  4.7;  5.9;  5.2;  5.3;  5.9;  5.6;  5.0;

Y, g/cm³:  3;  3;  4;  5;  10;  2;  9;  3;  7;  6;  6;  4.

Determine:

a) the linear regression model $Y = \beta_0 + \beta_1 X + \varepsilon$;

b) the 95% confidence interval for $\beta_1$;

c) the analysis of variance, and check the lack of fit of the model.


Problem 1.55 [4]

The relation between the heat capacity of liquid sulfuric acid in
cal/g °C and temperature in °C is as follows:

Cp, cal/g °C:  0.377;  0.389;  0.396;  0.405;  0.466;  0.458;

T, °C:  50;  100;  150;  200;  250;  300

Determine the regression coefficients in the linear regression:

$C_p = \beta_0 + \beta_1 T + \varepsilon$.


Problem  1.56  [4] 

The irritant factor Y of polluted air can be determined as a function
of the concentrations of SO2 and NO2 in the atmosphere. The following
data are available, where X1 = parts NO2 per ten million parts
of air and X2 = parts SO2 per hundred million parts of air. Determine
the irritant factor as a function of X1 and X2:


X1:  10;  12;  15;  16;  19;  21;  25;  28;

X2:  12.5;  15;  18;  21;  26;  30;  35;  40;

Y:  65;  72;  82;  95;  110;  122;  125;  130.

Problem  1.57 

In the production of ethylene glycol from ethylene oxide, the conversion
of ethylene to ethylene oxide, X, is a function of the activity Z1
of the silver catalyst and the residence time Z2. The following coded
data are available:


X:  12.1;  11.9;  10.2;  8.0;  7.7;  5.3;  7.9;  7.8;  5.5;  2.6;

Z1:  0;  1;  2;  3;  4;  5;  6;  7;  8;  9;

Z2:  7;  4;  4;  6;  4;  2;  1;  1;  1;  0.

a)  Write a suitable model;

b)  What portion of the data does your regression equation explain?


Problem 1.58 [4]

Two supposedly identical Brook’s-model R-2-65-5 rotameters with
316 stainless-steel spherical floats were calibrated for helium service
at 20 psig input, 74 °F. Let Y = flow rate (ml He/min) and X = scale
reading (mm). The data are given below:


X:  10;  10;  14;  16;  19;  20;  25;  30;  30;  35;

Y1:  9.2;  9.5;  -;  15.0;  -;  21.3;  29.4;  41.0;  40.5;  54.0;

Y2:  8.4;  8.6;  12.6;  18.6;  18.0;  27.0;  36.8;  36.3;  49.1.

X:  35;  40;  40;  45;  45;  50;  50;  55;  55;  60;

Y1:  55.0;  68.0;  68.8;  86.0;  88.1;  103.2;  104.6;  124.0;  123.1;  144.0

Y2:  50.0;  64.8;  64.8;  81.3;  82.0;  100.4;  97.0;  114.7;  117.0;  134.0

Problem  1.59  [25] 

Magnetic  material  is  mechanically  separated  from  the  slurry  of 
ground  ore  and  rolled  into  balls  that  are  to  be  sent  to  furnaces  for 
producing  small  balls.  To  reinforce  the  material  better  and  to  give  it 
greater  hardness  a binder  such  as  natural  peat  is  usually  added.  The 
content  of  the  binder  has  an  effect  on  ball  hardness,  as  can  be  seen 
from  experimental  values. 


Y:  3.6;  9.8;  14.7;  16.2;  16.0;  15.5. 

X:  0.0;  4.0;  8.0;  12.0;  16.0;  20.0. 


Determine  the  relationship  between  grinding  and  content  of  the 
binder. 


Problem  1.60 

The effect of temperature on the bleaching of a final product was
determined experimentally. The obtained data are:


X, K:  460;  450;  440;  430;  420;  410;  450;  440;

Y, bleaching degree:  0.3;  0.3;  0.4;  0.4;  0.6;  0.5;  0.5;  0.6;

X, K:  430;  420;  410;  400;  420;  410;  400

Y, bleaching degree:  0.6;  0.6;  0.7;  0.6;  0.6;  0.6;  0.6;

Determine:


a)  Linear regression model $Y = \beta_0 + \beta_1 X + \varepsilon$.

b)  Analysis of variance and check of lack of fit of the regression model.

c)  95% confidence interval for the mean Y for the X values.



References 

1 Simonovic, D., Vukovic, D., Cvijovic, S., Djurdjevic, K.S., Tehnoloske operacije, TMF, Beograd, 1973.

2 Andersen, B.L., Chemical Engineering, October 29, 119-124, 1962.

3 Vukadinovic, V.S., Elementi teorije verovatnoce i matematicke statistike, Privredni Pregled, Beograd, 1978.

4 Bethea, M.R., Duran, S.B., Boullion, L.T., Statistical Methods for Engineers and Scientists, Marcel Dekker, N.Y., 1975.

5 Bennett, C.A., Franklin, N.L., Statistical Analysis in Chemistry and the Chemical Industry, Wiley, N.Y., 1954.

6 Lindgren, B.W., Statistical Theory, Macmillan, N.Y., 1962.

7 Andersen, B.L., Chemical Engineering, December 24, 83-86, 1962.

8 Andersen, B.L., Chemical Engineering, January 21, 117-120, 1963.

9 Davies, L.O., The Design and Analysis of Industrial Experiments, Oliver and Boyd, London, 1963.

10 Brownlee, K.A., Statistical Theory and Methodology in Science and Engineering, Wiley, N.Y., 1960.

11 Andersen, B.L., Chemical Engineering, April 15, 157-162, 1963.

12 Hicks, R.C., Fundamental Concepts in the Design of Experiments, N.Y., 1965.

13 Dudukovic, B., Hemijska Industrija, br. 5, 247-250, 1976.

14 Lazic, R.Z., Vukovic, V.D., Naucno Tehnicki Pregled, br. 4, Vol. XXX, 29-35, 1980.

15 Andersen, B.L., Chemical Engineering, September 2, 99-105, 1963.

16 Lazic, R.Z., Naucno Tehnicki Pregled, br. 10, Vol. XXIX, 32-39, 1979.

17 Pepic, P., Naucno Tehnicki Pregled, br. 3, Vol. XXXVI, 10-13, 1986.

18 Mouradian, G., Ind. Qual. Control, 22, 516-520, 1966.

19 Bartlett, M.S., Proc. Roy. Soc., A901, 160, 273-275, 1964.

20 Hadjivukovic, S., Tehnika Metoda Uzorka, Naucna Knjiga, Beograd, 1975.

21 Himmelblau, D., Analiz Processov Statisticeskim Metodami, Per. s Angl., Moskva, “Mir”, 1970.

22 Draper, N.R., Smith, H., Applied Regression Analysis, Wiley, New York, 1966.

23 Andersen, B.L., Chemical Engineering, May 13, 173-178, 1963.

24 Natrella, G.M., Experimental Statistics, NBS, 1963.

25 White, R.G., Ind. Eng. Chem., 53, 215-216, 1961.




II 

Design  and  Analysis  of  Experiments 

2.0 

Introduction  to  Design  of  Experiments  (DOE) 

Design  of  experiments,  like  any  other  scientific  discipline,  has  its  own  terminology, 
methodology  and  subject  of  research.  The  title  of  this  scientific  discipline  itself 
clearly  indicates  that  it  deals  with  experimental  methods.  A large  number  of  experi- 
ments is  done  in  research,  development  and  optimization  of  the  system.  This 
research  is  done  in  labs,  pilot  plants,  full-scale  plants,  agricultural  lots,  clinics,  etc. 
An  experiment  may  be  physical,  psychological  or  model  based.  It  may  be  performed 
directly  on  the  subject  or  on  its  model.  The  model  usually  differs  from  the  subject  in 
its  dimensions  and  sometimes  in  its  nature.  The  experiment  may  also  be  done  on 
an  abstract  mathematical  model.  When  a model  describes  the  subject  precisely 
enough,  the  experiment  on  the  subject  is  generally  replaced  by  an  experiment  on 
the  model.  Lately,  due  to  a rapid  development  of  computer  technology,  physical 
models  are  more  frequently  replaced  by  abstract  mathematical  ones. 

Experiment occupies a central place in science, particularly nowadays, owing to the
complexity of the problems science deals with. This raises the question of how
efficiently experiments are used. J. Bernal estimated that scientific research is
organized and carried out so chaotically that its coefficient of usability is only
about 2%. To increase research efficiency, something completely new has to be
introduced into classical experimental research.

One such innovation is the application of mathematical statistics, that is, the
development of design of experiments (DOE). DOE is a planned approach for
determining cause-and-effect relationships.

Hereby,  the  following  is  essential: 

• reduction  or  minimization  of  total  number  of  trials; 

• simultaneous  varying  of  all  factors  that  formalizes  experimenter’s  activities; 

• choice  of  a clear  strategy  that  enables  reliable  solutions  to  be  obtained  after 
each  sequence  of  experiments. 

In developed countries, the methodology of design of experiments has expanded
markedly and is used to solve very complex problems in all fields of human activity.
An important part in this expansion was played by the development of electronic
computers, which greatly accelerated and simplified statistical calculations.

Chemical and engineering studies, like those in other fields, are based on complex,
long-term and relatively expensive experiments. Experimental work is included
in:


• physical  and  chemical  studies  for  establishing  constants  and  properties  of 
elements,  chemical  compounds  and  materials; 

• routine  analyses  of  raw  materials,  intermediates  and  final  products; 

• lab  studies  for  designing  and  developing  technological  processes; 

• optimization  of  technological  procedures  in  the  lab,  pilot-plant  and  full-scale 
plant  systems; 

• optimization  of  mixture  or  “ composition-properties 

• mathematical  modeling  of  a system; 

• selection  of  factors  by  the  significance  of  their  effects  on  a measured  value- 
response; 

• estimates  and  definitions  of  theoretic  model  constants,  etc. 

Hence,  wherever  experiments  exist  there  should  be  new  scientific  disciplines 
dealing  with  their  designing  and  analysis. 

The  efficiency  of  experimental  research  is  determined  by  the  degree  of  precision 
and  completeness  of  data  and  information  about  the  system  that  is  being  tested. 
This  degree  results  from  applying  the  methodology  of  design  on  the  experiments 
and  on  the  way  the  obtained  experimental  data  are  analyzed.  It  is  important  at  this 
point  to  consider  the  manner  in  which  the  experimental  data  were  collected  as  this 
greatly  influences  the  choice  of  the  proper  technique  for  data  analysis.  Before  going 
any  further  it  is  well  to  point  out  that  the  person  performing  the  data  analysis 
should  be  fully  aware  of  several  things: 

• What  is  the  objective  of  the  research? 

• What  is  considered  a significant  research  finding? 

• How are the data to be collected and what are the factors that affect the
responses?

If  an  experiment  has  been  properly  designed  or  planned,  the  data  will  be  collected 
in  the  most  efficient  form  for  the  problem  being  considered.  Experimental  design  is 
the  sequence  of  steps  initially  taken  to  insure  that  the  data  will  be  obtained  in  such  a 
way  that  its  analysis  will  lead  immediately  to  valid  statistical  inferences.  Before  a 
design  can  be  chosen,  the  following  questions  must  be  answered: 

• How  to  measure  the  response  and  the  factor’s  effect? 

• How  many  of  the  factors  will  affect  the  response? 

• How  many  of  the  factors  will  be  considered  simultaneously? 

• How  many  replications  (repetitions)  of  the  experiment  will  be  required? 

• What  type  of  data  analysis  is  required  (regression,  ANOVA,  etc.)? 

• What level of difference in effects is considered significant?

The purpose of statistically designing an experiment is to collect the maximum
amount of relevant information with a minimum expenditure of time and
resources. It is also important to remember that the design of an experiment should
be as simple as possible and consistent with the requirements of the problem. Hence,
design of experiments requires a new approach to research, one that is far from the
traditional (classical) methods of empirical research. The traditional approach
demands considerable material expense and is more time consuming, because the
effect of each factor is investigated separately: the experiment is designed to
investigate one factor at a time, while all other independent variables (factors) are
held constant. This is the so-called classical experimental design, and it is the one
that has been favored almost exclusively among scientists and engineers. Even so,
the factors are given no more than 4 or 5 different values (levels of variation),
because the total number of trials grows very quickly. If, for instance, the effect of
five factors is to be tested and each of them may be varied at five levels, then a
complete test of the research subject requires $5^5 = 3125$ different combinations of
factor levels (trials), with no trial replications to reduce experimental error. Such a
planned number of classical design points is hard to realize, so in practice it is
reduced either by narrowing the investigated region of the factor space or by
reducing the number of factor levels. In both cases, the confidence of the
conclusions based on the experimental results is reduced.
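The count of 3125 combinations can be reproduced by enumerating every combination of levels. The sketch below is illustrative only: the coded level values 0 to 4 and the variable names are assumptions, not part of the original text.

```python
from itertools import product

levels = range(5)   # five coded levels per factor (illustrative coding)
k = 5               # number of factors

# Enumerate every combination of factor levels and count them
n_full = sum(1 for _ in product(levels, repeat=k))

# The same count from the formula p**k
n_formula = len(levels) ** k

print(n_full, n_formula)
```

The count grows as p to the power k, which is why complete testing at many levels quickly becomes impractical.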

Besides, a significant part of the information obtained in this way is of no practical
use, because it refers to a region of the factor space that is far from the optimum.
Even more drastic errors are possible when all the necessary trials are done: because
of the huge time consumption, uncontrolled changes in the quality of inlet raw
materials or in the experimental plant are not accounted for, so the first and final
trial results of an experimental program are not comparable from the accuracy point
of view. Another important drawback of classical experimenting is that the effects of
interactions between the analyzed factors cannot be singled out, which greatly
increases the errors in estimating the responses as functions of the observed factors.
An additional difficulty arises in estimating the lack of fit of the obtained
mathematical model, since an estimate of experimental error is usually missing.
Finally, the results of a classical experiment are difficult to interpret, because a
simultaneous analysis is impossible with a large number of tables and graphs.

Most of these problems can be avoided by applying design of experiments, with a
simultaneous increase in the efficiency of empirical research. Research time may be
reduced by a factor of ten or more. In the example with five factors, the designed
experiment can be done with only 32 trials by using a rotatable design of second
order. Cases are known where design of experiments led to an optimal solution
while a classical experiment yielded none in a reasonable time.
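The figure of 32 trials is consistent with one common construction of a second-order rotatable (central composite) design for five factors. The text does not spell out the construction, so the breakdown below is an assumption: a half-fraction of the two-level factorial as the core, two axial points per factor, and six centre points.

```python
k = 5                        # number of factors

n_core = 2 ** (k - 1)        # half-fraction of the 2^5 factorial: 16 points
n_axial = 2 * k              # two axial ("star") points per factor: 10 points
n_center = 6                 # replicated centre points (assumed value)

n_total = n_core + n_axial + n_center

# Axial distance that makes such a design rotatable: fourth root of the
# number of factorial core points
alpha = n_core ** 0.25

# Reduction relative to the 5^5 classical programme
ratio = 5 ** k / n_total

print(n_total, alpha, round(ratio, 1))
```

Under these assumptions the designed experiment needs roughly one hundredth of the trials of the complete five-level programme.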

By using design of experiments, a researcher’s intuition is developed and his
way of thinking changed. It may therefore be said that the design and analysis of
experiments is a scientific method for processing experimental results, for finding
optimal solutions and for research that has the experiment as its subject. Design
of experiments also retains a traditional element of research, namely the use of
experimental data to obtain a mathematical model of a system. In the general case,
and from a mathematical point of view, the mathematical models used may turn out
to be complicated functions.

Response,  aim  function  or  optimization  criterion  may  have  the  form: 

$y = f(x_i, z_j, w_j)$ (2.1)

where:

$y$ is the response, aim function or optimization criterion;

$x_i$ are the controllable independent variables (factors);

$z_j$, $w_j$ are variables and constants that affect $y$ but are uncontrollable;

$f$ is the function that defines the relationship between $y$, $x_i$, $z_j$ and $w_j$.

Besides, one should also keep in mind the equalities and inequalities that
define the constraints on the controllable factors. Equation (2.1) defines the constraints
of a research subject. Research solutions may be considered optimal if they give the
maximum or minimum of the response function under the given constraints.

It has to be remembered that each model is an approximate solution and generally
is not an exact description of the research subject. The optimal solution of a model
is therefore considered an approximate optimum of the real system. This has a good
and a bad side. The good side is that the models are not complicated, since to be
close to the real system they would have to be very complex. On the other hand,
insufficient realism of a model reduces confidence in the solution.

In classical research methods, the main objective is to define a rule or law that, at a
given level of knowledge, has the character of an absolute category: the law is
either unconditionally correct or it is not. Such an approach makes the study of a
complex system difficult, for when many factors have complex effects it is hard to
find a mathematical description that accords with the laws. Approximate solutions
also make no sense in this setting, because one cannot talk about “bad” and “good”
laws. In the new approach to solving problems, that is, in design of experiments, the
mathematical model is not absolute. It only offers an approximate idea of the
research subject, and one may speak of “good” and “bad” mathematical models. The
essence of design of experiments is that it enables optimal solutions to be obtained
even when it is practically impossible to obtain a functional (deterministic)
mathematical model and to define a rule precisely. Design of experiments
characteristically uses polynomial models, since the quality of approximation may be
improved by increasing the degree of the polynomial. Such models are especially
suitable for solving optimization problems, because they make it possible to take
into account interaction effects and a large number of factors. They also make it
easy to estimate the lack of fit of polynomial models of different orders.
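For orientation, the general form of such a polynomial model for k factors can be written as below. The notation ($b_0$, $b_i$, $b_{ij}$, $b_{ii}$ for the regression coefficients) is a common convention assumed here, not taken from this passage. Truncating after the first sum gives a linear model, adding the second sum brings in two-factor interactions, and adding the third gives a full second-order model.

```latex
y = b_0 + \sum_{i=1}^{k} b_i x_i
        + \sum_{i<j} b_{ij} x_i x_j
        + \sum_{i=1}^{k} b_{ii} x_i^2 + \dots
```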

A designed  or  active  experiment  is  based  on  using  general  methodological  con- 
cepts such  as  regression  and  correlation  analysis,  analysis  of  variance,  randomiza- 
tion, optimal  use  of  factor  space,  successive  experimenting,  replication,  compact- 
ness of  information,  statistical  estimates,  etc. 

The mathematical apparatus of regression analysis is used in design of experiments.
The assumptions of regression analysis should therefore be taken into account
when performing an experiment. This means that the trial results are independent,
normally distributed random values of equal variance. In other words, the
experimental results in each trial are obtained with a certain probability, the
distribution of such values in each trial follows the normal distribution law, and
their variances are practically equal. The distribution law of the experimental
results is considered because a random value is defined once its distribution law is
known. The stress is on the normal distribution because the mathematical apparatus
used is then the most efficient, and because the normal distribution of data is the
one most frequently met in practice. The fact that some experimental results do not
follow this law is not a serious problem, since the mathematical transformations
given in Section 1.5 can bring such results to the normal distribution law. Equality
of the variances is of particular importance in experiments with a minimal number
of runs, such as designed experiments, because of its effect on their confidence
level. This condition is fulfilled if the variance of one trial is equal to the variance
of any other trial. Equality of variances is checked by the tests from Section 1.5. If
the variances prove unequal, the same kind of transformations are applied as for the
normality of the data distribution. These checks are easily performed because
replicated trials are available; replication of trials is a principle of design of
experiments.

One assumption of regression analysis is that the factors are measured or fixed with
high precision. When measuring or fixing a factor, conditions are recommended
under which the factor measurement error is negligibly small compared with the
error in determining the response.

Randomization is also an important idea in design of experiments. It means doing
the trials in a random sequence so as to cancel the influence of systematic factors
that are difficult to stabilize and control. In this way, one of the main concepts of
the classical experiment, the necessity of fixing disturbance factors, is abandoned.
Randomization is the means used to eliminate any bias in the experimental units
and/or treatment combinations (trials). If the data are random, it is safe to assume
that they are independently distributed. Errors associated with experimental units
that are adjacent in time or space will tend to be correlated, thus violating the
assumption of independence. Randomization helps to make this correlation as small
as possible, so that the analyses can be carried out as though the assumption of
independence were true.
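In practice, randomizing the sequence of trials amounts to shuffling the list of planned runs before they are carried out. The following is a minimal sketch for an assumed program of eight trials; the seed is fixed only so that the illustration is reproducible and would normally be omitted.

```python
import random

# Planned trials, numbered as in the design table (eight trials assumed)
trials = list(range(1, 9))

# Fixed seed only for a reproducible illustration
rng = random.Random(2024)

# Random execution order: every trial appears exactly once
run_order = trials[:]
rng.shuffle(run_order)

print(run_order)
```

The design table itself is unchanged; only the order in which its rows are executed is random.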

The  idea  of  the  concept  of  successiveness  in  doing  an  experiment  is  as  follows. 
Empirical  research  should  consist  of  separate  successive  stages  or  series  of  trials  and 
not  of  designing  a complete  experimental  research  in  advance.  An  active  experiment 
should  have  the  property  of  successiveness,  or,  each  next  stage  is  projected  and  de- 
signed based  on  the  results  of  previous  trials. 

Optimality  of  using  the  factor  space  for  an  adequate  multifactor  experiment 
means  an  increase  in  experiment  efficiency  proportional  to  the  increase  in  the  num- 
ber of  its  factors. 

The  estimate  precision  of  a polynomial  model  regression  coefficients  rises  with 
an  increase  in  the  number  of  factors,  because  the  diameter  of  the  sphere  of  factor 
space,  within  which  variation  limits  of  each  factor  lie,  also  increases. 



The  concept  of  information  compactness  refers  to  the  result  analysis  of  a de- 
signed experiment.  This  means  that  final  results  do  not  require  a large  number  of 
tables  and  graphs. 

The  concept  of  statistical  estimates  refers  to  the  threshold  or  significance  level 
where  the  estimate  of  a parameter,  model  or  solution  is  either  accepted  or  rejected. 

Finally,  it  should  be  pointed  out  once  again  that  obtaining  as  precise  and  com- 
plete information  on  a studied  chemical  or  physical  system  as  possible,  with  a mini- 
mal number  of  experiments  and  the  lowest  possible  expenses,  is  the  necessary  con- 
dition for  efficient  research  work.  Therefore,  application  of  modern  mathematical 
and  statistical  methods  in  designing  and  analyzing  experimental  results  is  a real 
necessity  in  all  fields  and  phases  of  work,  starting  with  purely  theoretical  considera- 
tions of  a process,  its  research  and  development,  all  the  way  to  designing  equipment 
and  studying  optimal  operational  conditions  of  a plant. 

All  empirical  research  methodologies  may  be  divided  into  two  large  groups: 

• classical  or  passive, 

• active  or  statistically  designed. 

Classical  design  of  experiments-one  factor  at  a time 

Experiments may be designed to investigate one factor at a time, so that all other
independent variables (factors) are held constant. This is the so-called classical
experimental design. A classical experiment means researching the mutual
relationships between the variables of a system under “specially adapted conditions”.

Let us consider an example of system research where the effects of k factors at p
levels are to be determined. As mentioned above, the classical system of
experimenting requires each factor to be tested at p levels while the others are kept
constant at chosen fixed values. The total number of trials to be done by this scheme is:

N=k(p-1)+1  (2.2) 

Assume production in a chemical reactor where the product yield y is essentially
affected by three factors: X1, the reaction mixture temperature; X2, the pressure in
the reactor; and X3, the reaction time. If all factors are varied at two levels (p=2),
the research program consists of four trials (N=4). The lower factor levels are
marked by the symbol “-” and the upper ones by “+”. The conditions of each run
are shown in Table 2.1.

Table 2.1 Experimental combinations

Number of    Factor level combinations
trials       X1    X2    X3      y     Remark
1            -     -     -       y1    Reference run
2            +     -     -       y2
3            -     +     -       y3
4            -     -     +       y4

After realizing each trial, it is possible to determine the factor effects on the
product yield:

$E_{X_1} = y_2 - y_1$; temperature effect on yield;

$E_{X_2} = y_3 - y_1$; pressure effect on yield; (2.3)

$E_{X_3} = y_4 - y_1$; time effect on yield;

An analysis of this scheme reveals the following deficiencies:

• there is no estimate of experimental error;

• interaction effects cannot be determined;

• the result of the reference trial ($y_1$) carries too much weight, since it is used
three times in determining the effects.

Based on this kind of analysis the researcher may decide to check the precision of
the results by repeating the trials. Precision is the repeatability of the results of a
particular experiment. However, apart from making it possible to determine the
experimental error, repeating the trials offers no new information.
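The one-factor-at-a-time scheme of Table 2.1 and the effects of Eq. (2.3) can be sketched as follows. The function names and the yield values are hypothetical and serve only to illustrate the calculation; levels are coded as -1 (lower) and +1 (upper).

```python
def ofat_trials(k, p):
    """Number of trials in the classical scheme, Eq. (2.2): N = k(p-1)+1."""
    return k * (p - 1) + 1


def ofat_design(k):
    """Two-level one-factor-at-a-time plan: a reference run with all
    factors at the lower level, then k runs raising one factor each."""
    runs = [[-1] * k]
    for j in range(k):
        run = [-1] * k
        run[j] = +1
        runs.append(run)
    return runs


design = ofat_design(3)          # rows correspond to trials 1-4 of Table 2.1

# Hypothetical yields y1..y4, invented for illustration only
y = [60.0, 72.0, 66.0, 63.0]

# Eq. (2.3): each effect is a difference from the reference run y1
effects = [y[j + 1] - y[0] for j in range(3)]

print(design)
print(effects)
```

Every effect depends on the single reference result, so an error in that one run propagates into all three effects.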

Statistical  design  of  experiments-DOE 

The  mentioned  deficiencies  of  the  classical  design  of  an  experiment  may  efficiently 
be  removed  and  overcome  by  statistical  design  and  calculation  of  obtained  results  by 
means  of  methods  of  statistical  analysis. 

If  for  the  studied  example,  instead  of  repetition,  the  experimental  program  is 
expanded  by  additional  combinations  of  factor  levels-trials,  as  shown  in  Table  2.2, 
we  get  an  experiment  with  eight  trials. 

Table 2.2 Additional experimental combinations

Number of    Factor level combinations
trials       X1    X2    X3      y     Remark
5            +     +     -       y5
6            +     -     +       y6
7            -     +     +       y7
8            +     +     +       y8

A complete  design  of  experimental  research,  which  includes  all  eight  design 
points,  is  one  of  the  best-known  statistical  experimental  designs,  the  so-called  full 
factorial  design. 

Factorial design of experiments, combined with statistical methods of data analysis,
offers wider and more differentiated information on the system, and its conclusions
are of greater usability. The results of all eight runs in the analyzed example are
used to determine the factor effects: they provide seven independent possibilities of
testing the effects, and one serves for comparison with the chosen fixed values.
Three of the seven independently determined effects are the basic effects $E_{X_1}$,
$E_{X_2}$ and $E_{X_3}$, and the other four are their mutual interactions $E_{X_1X_2}$,
$E_{X_1X_3}$, $E_{X_2X_3}$ and $E_{X_1X_2X_3}$, following these expressions:


Table 2.3 Full factorial design 2^3

Number of    Factor level combinations    Response
trials       X1    X2    X3               y          Remark
1            -     -     -                y1         Reference trial
2            +     -     -                y2
3            -     +     -                y3
4            +     +     -                y4
5            -     -     +                y5
6            +     -     +                y6
7            -     +     +                y7
8            +     +     +                y8

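$$
\begin{aligned}
E_{X_1} &= \frac{y_2+y_4+y_6+y_8}{4} - \frac{y_1+y_3+y_5+y_7}{4} \\
E_{X_2} &= \frac{y_3+y_4+y_7+y_8}{4} - \frac{y_1+y_2+y_5+y_6}{4} \\
E_{X_3} &= \frac{y_5+y_6+y_7+y_8}{4} - \frac{y_1+y_2+y_3+y_4}{4} \\
E_{X_1X_2} &= \frac{y_1+y_4+y_5+y_8}{4} - \frac{y_2+y_3+y_6+y_7}{4} \\
E_{X_2X_3} &= \frac{y_1+y_2+y_7+y_8}{4} - \frac{y_3+y_4+y_5+y_6}{4} \\
E_{X_1X_3} &= \frac{y_1+y_3+y_6+y_8}{4} - \frac{y_2+y_4+y_5+y_7}{4} \\
E_{X_1X_2X_3} &= \frac{y_2+y_3+y_5+y_8}{4} - \frac{y_1+y_4+y_6+y_7}{4}
\end{aligned}
\tag{2.4}
$$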
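The effect expressions (2.4) all follow one rule: average the responses of the trials where the sign (or product of signs) is plus and subtract the average of those where it is minus. The sketch below illustrates that rule for the eight trials of Table 2.3. The function names and the response values are assumptions made for illustration only; the responses are invented from a known formula so that the expected effects are easy to check by hand.

```python
from itertools import product
from math import prod


def full_factorial(k):
    """Return the 2^k design in coded levels (-1/+1), with X1 changing fastest."""
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]


def effect(design, y, factors):
    """Mean of y where the sign product of `factors` is +, minus the mean where it is -."""
    signs = [prod(row[f] for f in factors) for row in design]
    plus = [yi for s, yi in zip(signs, y) if s > 0]
    minus = [yi for s, yi in zip(signs, y) if s < 0]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


design = full_factorial(3)  # rows correspond to trials 1..8 of Table 2.3

# Invented responses (an assumption, not measured data):
# y = 50 + 5*X1 - 3*X2 + 2*X1*X2, with no contribution from X3.
y = [50 + 5 * x1 - 3 * x2 + 2 * x1 * x2 for x1, x2, x3 in design]

effects = {
    "X1": effect(design, y, (0,)),
    "X2": effect(design, y, (1,)),
    "X3": effect(design, y, (2,)),
    "X1X2": effect(design, y, (0, 1)),
    "X1X3": effect(design, y, (0, 2)),
    "X2X3": effect(design, y, (1, 2)),
    "X1X2X3": effect(design, y, (0, 1, 2)),
}
```

Because every effect uses all eight responses, each estimate is an average over four differences, which is the efficiency gain the text attributes to the factorial design.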

As has been said before, in the classical experiment, which with replication has
N = 2 × 4 = 8 trials, the results of three trials are used for establishing the basic
factor effects, one serves as a reference value and the remaining four serve for
determining the experimental error. The advantages of factorial design are evident,
and they are most pronounced in experiments with a larger number of factors. The
basic advantages of design of experiments over the classical one-factor-at-a-time
approach are as follows:

• it makes it possible to establish the regularities of phenomena in the
experimental domain as a whole, so that the conclusions drawn from the results
are of wider use;

• it offers wider possibilities of testing the effects of factor variation on the final
result, since the results of all trials are used for calculating the effects;

• it makes it possible to establish the size of factor interactions; moreover, this is
the only way such interactions can be determined;

• data  accuracy  from  an  active  experiment  is  reached  through  considerably 
fewer  statistically  designed  trials,  i.e.  at  the  same  number  of  trials  an  active 
experiment  offers  more  complete  and  precise  information; 

• the  final  research  objective  set  up  is  achieved  in  a systematic,  well  thought 
out  and  organized  way  in  a short  time  with  considerably  fewer  runs  and  the 
lowest  possible  material  costs; 
• in a classical experiment one is usually unable to take into account uncontrolled
changes, errors resulting from material variation, bias errors and
errors resulting from the sequence of testing;

• a classical experiment gives no information about the experimental error,
which is needed for judging the lack of fit of the obtained mathematical
model;

• a classical experiment yields clumsy tables and graphs that
are difficult to analyze simultaneously;

• an active experiment eliminates one of the main assumptions of classical
experimentation, namely the necessity of fixing disturbance factors. The
researcher is deliberately advised to create random situations (randomization)
so that hard-to-stabilize and uncontrolled factors acquire a random
character;

• an active experiment is sequential, i.e. each next stage is planned
and designed on the basis of the results of the previous series of trials;

• an active experiment changes the experimenter's way of reasoning,
sharpens his intuition, makes him active in planning further stages of
the experiment and requires the use of empirical and scientific background;

• a classical experiment is a special case of an active statistical design of
experiment in which the individual effect of certain factors on the system
response is tested. From a mathematical point of view, a classical experiment
gives the partial effect, while the active one gives the total effect, since in it
all factors are varied simultaneously.
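The randomization mentioned in the list above is usually applied to the order in which the designed trials are run. A minimal sketch, assuming the trials are simply numbered 1 to N; the function name and the use of a seed for reproducibility are illustrative choices, not part of the original text.

```python
import random


def randomized_run_order(n_trials, seed=None):
    """Return the trial numbers 1..n_trials in random execution order."""
    rng = random.Random(seed)  # a seed makes the order reproducible
    order = list(range(1, n_trials + 1))
    rng.shuffle(order)
    return order


# Execution order for the eight trials of a 2^3 full factorial design.
run_order = randomized_run_order(8, seed=2024)
```

Running the trials in such an order spreads slow drifts and other uncontrolled influences randomly over the design points instead of letting them coincide with one factor.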

Table 2.4 shows basic statistical designs for all kinds of quantitative and
categorical/qualitative factors.

Table 2.4 Basic DOE designs

| Experimental design | Factors | Application |
|---|---|---|
| Simple comparative designs | Categorical/qualitative and quantitative | Check of method, testing of single factor effect |
| Random blocks and Latin squares | Differences between batches, treatments, samples | Calculation of effects with elimination of inequality of experimental conditions |
| Fractional replicate designs | Categorical/qualitative and quantitative | Screening of factors |
| Random balance design | Categorical/qualitative and quantitative | Screening of factors |
| Full factorial designs | Categorical/qualitative, quantitative and combined | Choice of factors, calculation of main effects and interactions |
| Central composite rotatable designs | Quantitative | Regression models of second order |
| Central composite orthogonal designs | Quantitative | Regression models of second order |
| Simplex lattice design | Quantitative | Mixture problems, regression models of second and higher orders |
| Extreme vertex design | Quantitative with constraints | Mixture problems, regression models of second and higher orders |
| Hartley's, Kono's, Kiefer's, D-optimal | Quantitative | Regression models |
| Higher-order designs | Quantitative | Regression models of higher order |

2.1 Preliminary Examination of Subject of Research

2.1.1 Defining Research Problem

Experimental  research  of  the  system  must  be  preceded  by  preliminary  examination  of 
the  subject  of  research  aimed  at  obtaining  information  necessary  for  defining  the 
research  objective. 

The modern approach to experimental research presupposes that, to obtain the
optimal solution, the research problem must be defined correctly. It should be
defined in such a way as to enable the use of the most efficient algorithms and
methods of designed experiments. For a concrete definition of a research problem, it
is necessary to formulate its objective clearly, choose the research subject model and
analyze the preliminary information. Special attention should be paid to the
conditions set in the problem with reference to the capability of the available
experimental plant. The next step is the choice of a preliminary design of experiment.
When choosing it, one must take into account all the particularities of the research
problem, and all known designs of experiments must be analyzed in this respect. The
design or method that is most efficient in the particular case is chosen. The methods
and designs of experiments for further research stages are considered after completing
and analyzing the previous research. As Fig. 2.1 shows, the new approach to
experimental research requires long prior preparation of the experiment, aimed at
increasing the efficiency of experimentation.

The  research  objective  may  be  defined  if  the  research  subject  or  optimization  sub- 
ject is  defined,  if  its  requirements  are  known  and  if  there  exist  interactions  that 
change  the  quality  of  a research  subject  with  the  change  of  requirements. 

The next step is the choice of the research subject model. It has been said before that
design of experiments rests on cybernetic concepts about the research subject. A
"black box", affected by various controllable factors, is therefore recommended as
the research subject model. The principles that define such a model reflect the
researcher's insufficient preliminary knowledge of the mechanism of the multifactor
phenomenon under study.

Figure 2.1 Block diagram of experimental research

Figure 2.2 shows the black-box model. The inlet arrows $X_1, X_2, \ldots, X_k$
are the possibilities of affecting the research subject. The outlet arrows
$y_1, y_2, \ldots, y_m$ are responses, optimization criteria or aim functions.
Figure 2.2 Black box model

Input variables are controllable, uncontrollable and disturbance variables.
Controllable variables or factors $X_1, X_2, \ldots, X_k$ are variables that can be set
directly and through which the research subject can be affected in order to change the
response. They can be numerical (example: temperature) or categorical (example: raw
material supplier). Uncontrollable variables $Z_1, Z_2, \ldots, Z_p$ are measured and
monitored during the experiment but cannot be changed at will. They can be a major
cause of variability in the responses. Other sources of variability are deviations
around the set points of the controllable factors, plus sampling and measurement
error. Furthermore, the system itself may be composed of parts that also exhibit
variability. Disturbance (noncontrolled) variables $W_1, W_2, \ldots, W_q$ cannot be
measured, and their values change randomly in time.

Factors take values called levels of variation. Each state of a black box corresponds
to a definite combination of factor levels. The more different states of the black
box there are, the more complex the research subject is. Formalization of preliminary
information includes analysis of reference data, expert opinions and use of direct
data, which enables a correct selection of the response, the factors and the null
point or center of the experiment. Factor limitations are also defined at this stage.
If the research involves several responses, then response limitations also have to be
analyzed. The next phase is defining the research problem. When defining the problem
one must keep in mind the research-subject model; in the general case it is Eq. (2.1)
that defines the link between the inlet and outlet of the black box. Defining the
research problem is possible only now, when its aim has been determined, the criteria
established, and the factors, limitations and null point defined. The problem is a
simple one when only one response or optimization criterion is in question. In the
case of several optimization criteria, or multiple response optimization, the problem
becomes very complex.

By difficulty, research objectives may be divided into three levels:

• screening  factors  regarding  statistical  significance  of  their  effect  on  response; 

• obtaining  a mathematical  model  of  research  subject; 

• optimization  of  the  research  subject. 

Optimization of a research subject is the hardest research problem. It should be
noted at once that different optimization problems appear in practice. In most cases
these are extremum problems, i.e. searching for the extremes (minima and maxima) of a
response function in the case of one response and with factor limitations. Most such
problems have to do with finding the maxima of outlet and minima of inlet parameters.
There are also situations where only an improvement of the response with regard to
the initial state in the null point is required. Often a local optimum has to be
found when there are several of them.

Finding the mathematical model of the research subject is a lower level of research
objective. It is obligatory for a large number of problems, and it comes after the end
of factor screening or after finding the optimum. The general form of the mathematical
model of the research subject is:

$$y = \varphi(X_1, X_2, \ldots, X_k) \tag{2.5}$$

where:

$y$ is the response, optimization criterion, the value that is measured during the experiment;

$X_1, X_2, \ldots, X_k$ are controllable factors that are changed during the experiment.

The  aim  function  may  in  this  case  be  called  response  function  for  it  is  literally  the 
response  to  factor  change.  Geometrically,  a response  surface  corresponds  to  a 
response  function. 

It has been said before that we use polynomial models in the design of experiments.
With them we, in principle, approximate the response function (2.5) by a
polynomial:

$$y = \beta_0 + \sum_{i=1}^{k} \beta_i X_i + \sum_{\substack{i,j=1 \\ i<j}}^{k} \beta_{ij} X_i X_j + \sum_{i=1}^{k} \beta_{ii} X_i^2 + \ldots \tag{2.6}$$

where:

$\beta_0, \beta_i, \beta_{ij}, \beta_{ii}$ are theoretical regression or polynomial coefficients.

Based on experimental values, the real regression coefficients are estimated, so
that:

$$\hat{y} = b_0 + \sum_{i=1}^{k} b_i X_i + \sum_{\substack{i,j=1 \\ i<j}}^{k} b_{ij} X_i X_j + \sum_{i=1}^{k} b_{ii} X_i^2 + \ldots \tag{2.7}$$

where:

$\hat{y}$ is the predicted (calculated) response value,
$b_0, b_i, b_{ij}, b_{ii}$ are real regression coefficients.

From the values of the regression coefficients one may estimate the factor effects,
i.e. the degree of influence of the associated factors on the response. Geometrically,
Eq. (2.7) is the response surface over the k-dimensional factor region. For a more
detailed study of the optimum region, this surface may be cut two-dimensionally at
constant response values to obtain a contour graph, which is easily presented in a
plane.

Lack of fit of the obtained model has to be checked statistically so that, if needed,
the degree of the polynomial may be raised. Knowing the mathematical models of the
research subject for several responses is a prerequisite for solving optimization
with multiple responses. This computation is done geometrically or by computer, using
the methods of linear algebra.
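Estimating the coefficients of a second-order polynomial of the form (2.7) is an ordinary least-squares problem. The sketch below does this for k = 2 factors using only the Python standard library. The 3 × 3 grid of coded design points, the "true" coefficients and all function names are assumptions chosen for illustration; the responses are generated without noise so that the fit can be checked against the known coefficients.

```python
from itertools import product


def model_terms(x1, x2):
    """Terms 1, X1, X2, X1*X2, X1^2, X2^2 of the second-order model for k = 2."""
    return [1.0, x1, x2, x1 * x2, x1 ** 2, x2 ** 2]


def solve_linear(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_second_order(points, responses):
    """Least-squares estimates from the normal equations (F'F) b = F'y."""
    f = [model_terms(x1, x2) for x1, x2 in points]
    p = len(f[0])
    ftf = [[sum(row[i] * row[j] for row in f) for j in range(p)] for i in range(p)]
    fty = [sum(row[i] * y for row, y in zip(f, responses)) for i in range(p)]
    return solve_linear(ftf, fty)


# Hypothetical design: every combination of the coded levels -1, 0, +1.
points = list(product([-1.0, 0.0, 1.0], repeat=2))
# Invented "true" coefficients b0, b1, b2, b12, b11, b22 (not from the text).
true_b = [10.0, 2.0, -1.0, 0.5, -3.0, -2.0]
responses = [sum(b * t for b, t in zip(true_b, model_terms(x1, x2)))
             for x1, x2 in points]

b = fit_second_order(points, responses)
```

With real, noisy measurements the estimates would only approximate the underlying coefficients, and the residuals would feed the lack-of-fit check mentioned in the text.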

2.1.2 Selection of the Responses

Selection of the responses is one of the most important problems in a preliminary
study of the research subject, since a correct definition of the research objective
means a correct selection of the responses. An incorrect selection of the responses
annuls all further research activities. Depending on the subject and the research
objective, optimization parameters or responses may be quite different. To formalize
the procedure of response selection, with no intention of being detailed and complete,
Fig. 2.3 gives a block diagram of the most frequently used optimization parameters.

This block diagram includes the responses most frequently used in practice, and it
can help the researcher find his way in a real situation. Real situations are as a
rule very complex and usually require simultaneous analysis of several system
responses. Each research subject may, in principle, be characterized by the whole
population of responses given in Fig. 2.3 or by any subpopulation of them.
Optimization of such a research subject may be done only when a unique optimization
parameter has been selected. In such a case, all other responses are not optimization
parameters but are taken as constraints. The other way is to make one, so-called
general response, from all analyzed responses.

Figure 2.3 Block diagram of response selection

For a research subject parameter to be a response, it has to fulfill certain
conditions. A response should be:

• quantitative, 

• singular, 

• statistically  effective, 

• universal, 

• physically  realistic, 

• simple, 

• easily  measurable 

The system response should be quantitative, i.e. it must be possible to express it
by a number. The way it is measured for any combination of the factors that determine
it must also be known. The set of values that a response can take is called the
domain of the response. The domain of an optimization parameter may be continuous or
discrete, limited or unlimited. A chemical reaction yield, for example, is a
continuous, limited response, for it changes continuously in a limited range from 0
to 100%. The number of rejected products and the number of plant failures are examples
of discrete response domains limited on one side. When it is impossible to determine
a response quantitatively, we use the method of ranking. By this method, definite
estimates or ranks are assigned to an optimization parameter according to a
predetermined scale. The ranked response obtained has a discrete, limited domain. A
rank is a quantitative response estimate with a definite degree of subjectivity, i.e.
it is associated with qualitative meanings of the response. For any physically
measurable response it is possible to make up a response with ranks. One has to keep
in mind, however, that the rank method gives a less sensitive response, which makes
studying finer effects impossible.

Singularity of a response is the property of a quantitative parameter whereby one
and only one response value, with precision up to the size of the experimental error,
corresponds to a definite factor combination. The converse obviously does not hold,
for several factor combinations may correspond to one response value.

Besides the two mentioned properties, an optimization parameter should also be
statistically effective. This property comes down to choosing the optimization
parameter that can be determined with the highest possible precision. When the
precision of the response is insufficient, the number of trials is increased.

The universality of an optimization parameter means a many-sided and complete
characterization of a research subject. In this respect, technological optimization
parameters are not universal enough, for they do not include a property such as the
cost efficiency of a process. General optimization parameters are universal, as
they are a function of the necessary number of individual properties.

It is also desirable for an optimization parameter to have physical sense and to be
simple and easy to measure. The physical sense of a response matters for the
interpretation of results, while simplicity and ease of measurement matter when doing
an experiment.

When the research subject is a system that consists of several subsystems, it should
be kept in mind that we can talk about local optimization parameters. Discovering
local optima often does not mean that we have the optimum for the whole system.

Apart from the analyzed requirements to be fulfilled by an optimization parameter,
one should also keep in mind, when choosing the response, that this parameter
affects, up to a point, the choice of the research subject model. Economic
parameters are by their nature additive, so that they can easily be modeled by simple
functions, which is not the case for physical and chemical responses.


2.1.2.1 Subject of Research with Several Responses

Research problems with one response undoubtedly have an advantage. In practice,
however, we mostly meet research subjects with several responses, and often with a
literally large number of them. Thus, for example, when producing rubber, plastic and
other composite materials one must take into account responses such as
physical-chemical, technological, economic, mechanical (tensile strength, elongation,
modulus, etc.) and others. One can define a mathematical model for each of these
responses, but simultaneous optimization of several functions is, in general,
mathematically impossible, since their optima rarely coincide.

In such cases we usually optimize one response, the one that is most important by the
definition of the research objective, while constraints are imposed on the others. It
is useful in such situations to look for a possibility of reducing the number of
responses. This is where correlation analysis comes in. By means of correlation
analysis one should determine the correlation coefficients for all possible pairs of
responses.

If one response is marked $y_1$ and the other $y_2$, and if they are measured in N
runs, then the correlation coefficient over the trials u = 1, 2, ..., N is given by
the expression:

$$r_{y_1 y_2} = \frac{\sum_{u=1}^{N} (y_{1u}-\bar{y}_1)(y_{2u}-\bar{y}_2)}{\sqrt{\sum_{u=1}^{N} (y_{1u}-\bar{y}_1)^2 \, \sum_{u=1}^{N} (y_{2u}-\bar{y}_2)^2}} \tag{2.8}$$

where:

$$\bar{y}_1 = \frac{1}{N}\sum_{u=1}^{N} y_{1u}, \qquad \bar{y}_2 = \frac{1}{N}\sum_{u=1}^{N} y_{2u}$$

From correlation analysis it is known that the value of the correlation coefficient
lies between -1 and +1. If an increase in the value of one response causes the other
one to rise, their correlation coefficient is positive. The closer the absolute value
of the correlation coefficient is to one, the more the value of one response depends
on the value of the other, i.e. there is a linear connection between the responses,
so that only one of them needs to be followed on the actual research subject. It
should be noted once again that the correlation coefficient has a clear meaning only
in the case of a linear relationship and normal distribution of the parameters.

At a high correlation coefficient value, either of the two analyzed responses may
be discarded, as it adds no new information on the subject of research. Our suggestion
is to eliminate the response that is harder to measure or harder to interpret
physically.

Design of experiments insists on measuring all responses; then, by means of
correlation analysis, research subject models are made up for the least possible
number of responses or for a general response. This does not mean that there are no
cases in practice when correlated responses are used.
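The screening of responses by pairwise correlation described above can be sketched as follows. The function names, the response names, the data and the cut-off value of 0.9 for a "high" correlation are assumptions made for illustration; the text itself does not state a numerical threshold.

```python
from itertools import combinations
from math import sqrt


def correlation(y1, y2):
    """Pearson correlation coefficient of two responses measured in the same N runs."""
    n = len(y1)
    if n != len(y2) or n < 2:
        raise ValueError("both responses need the same number (>= 2) of runs")
    m1, m2 = sum(y1) / n, sum(y2) / n
    num = sum((a - m1) * (b - m2) for a, b in zip(y1, y2))
    den = sqrt(sum((a - m1) ** 2 for a in y1) * sum((b - m2) ** 2 for b in y2))
    return num / den


def strongly_correlated_pairs(responses, threshold=0.9):
    """Return pairs of response names whose |r| reaches the assumed threshold."""
    return [(a, b) for a, b in combinations(responses, 2)
            if abs(correlation(responses[a], responses[b])) >= threshold]


# Invented measurements of three hypothetical responses in five runs.
responses = {
    "strength": [10.0, 12.0, 14.0, 16.0, 18.0],
    "hardness": [21.0, 25.0, 29.0, 33.0, 37.0],   # exactly linear in strength
    "gloss":    [5.0, 1.0, 4.0, 2.0, 3.0],
}
redundant = strongly_correlated_pairs(responses)
```

For each pair returned, one of the two responses is a candidate for removal, preferably the one that is harder to measure or interpret.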

Summary 

Problems of choosing the responses of complex research subjects have been analyzed.

The optimization parameter is, in fact, a reaction or response to the changes of
factor levels that define the state of a research subject. Responses may be economic,
technoeconomic, technical-technological, statistical, psychological, etc. A response
should be quantitative, singular, statistically effective, universal, physically
realistic, simple and easily measurable. For responses with no quantitative
measurement, the ranking method is used. Out of all responses typical for a research
subject, only one response or a general response is taken. The other responses are
used as constraints.

2.1.2.2 General Response

It is difficult to single out one response as the most important one out of a large
number of responses that characterize a research subject. When this can be done, we
have the situation described in the previous section. A harder problem is to make
up one, so-called general response [1].

Each response has its physical sense and its dimension. To join such responses, it is
first necessary to introduce a non-dimensional scale for each response. The scale
must be of the same kind for all responses that are generalized. The choice of the
scale is not a routine job; it depends on the preliminary information we have about
the responses and on the precision required from the general response.

After choosing the non-dimensional scale for each response, one should define
the rules for combining partial responses. A unique rule or algorithm does not exist.

Simple  general  response 

Assume that a research subject is characterized by n partial responses $y_u$
(u = 1, 2, ..., n) and that each of these responses is measured in N trials. Then the
value of the u-th response in the i-th run is $y_{ui}$ (i = 1, 2, ..., N). Each of the
responses $y_u$ has its physical interpretation and its dimension. If we introduce a
non-dimensional scale with only two values, 0 and 1, the 0 corresponds to all those
values of partial responses that are unsatisfactory by their quality, and the 1
corresponds to those that are satisfactory. The values of partial responses
transformed according to the non-dimensional scale are marked $\tilde{y}_{ui}$, so
that $\tilde{y}_{ui}$ is the transformed value of the u-th response in the i-th trial.
After the transformation we have non-dimensional partial responses that should now be
generalized. Since partial responses take the values 0 and 1, it is logical to make up
the general response with the same values 0 and 1. The general response should have
the value 1 only when all partial responses have the value 1. When even one of the
partial responses takes the value 0, the general response must also have the value 0.
Under these conditions the general response satisfies this mathematical expression:

$$Y_i = \left[\prod_{u=1}^{n} \tilde{y}_{ui}\right]^{1/n} \tag{2.9}$$

where:

$Y_i$ is the general response in the i-th trial;

$\prod_{u=1}^{n}$ is the product of the transformed partial responses $\tilde{y}_{1i}, \tilde{y}_{2i}, \ldots, \tilde{y}_{ni}$.

The definition of the general response by formula (2.9) may be simplified by dropping
the exponent 1/n without affecting its essence, since the product takes only the
values 0 and 1:

$$Y_i = \prod_{u=1}^{n} \tilde{y}_{ui} \tag{2.10}$$

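Example 2.1

In developing an optimal technological procedure for producing a new plastic
material, the product quality had these seven characteristic responses: $y_1$
thermostability, $y_2$ material gloss, $y_3$ retention of properties at low
temperatures, $y_4$ modulus of elasticity at 20 °C, $y_5$ tensile strength, $y_6$
elongation at break and $y_7$ number of folds before rupture.

These transformations are introduced for the given partial responses:

$$\tilde{y}_{1i} = \begin{cases} 1, & \text{if } y_{1i} \ge 100 \\ 0, & \text{if } y_{1i} < 100 \end{cases} \qquad \tilde{y}_{2i} = \begin{cases} 1, & \text{if } y_{2i} \le 20 \\ 0, & \text{if } y_{2i} > 20 \end{cases}$$

$$\tilde{y}_{3i} = \begin{cases} 1, & \text{if } y_{3i} \le -18 \\ 0, & \text{if } y_{3i} > -18 \end{cases} \qquad \tilde{y}_{4i} = \begin{cases} 1, & \text{if } y_{4i} \le 120 \\ 0, & \text{if } y_{4i} > 120 \end{cases}$$

$$\tilde{y}_{5i} = \begin{cases} 1, & \text{if } y_{5i} \ge 200 \\ 0, & \text{if } y_{5i} < 200 \end{cases} \qquad \tilde{y}_{6i} = \begin{cases} 1, & \text{if } y_{6i} \ge 200 \\ 0, & \text{if } y_{6i} < 200 \end{cases} \qquad \tilde{y}_{7i} = \begin{cases} 1, & \text{if } y_{7i} \ge 25 \\ 0, & \text{if } y_{7i} < 25 \end{cases}$$

Experimental data of the nine trials are given in Table 2.5.

Two general responses are defined for a complex characterization of the material.
The first general response

$$Y_{1i} = \left[\tilde{y}_{1i} \times \tilde{y}_{2i} \times \ldots \times \tilde{y}_{7i}\right]^{1/7}$$

takes into account the producer's and the buyer's demands, while the second general
response, which considers only the buyer's demands, has the form

$$Y_{2i} = \left[\tilde{y}_{3i} \times \tilde{y}_{5i} \times \tilde{y}_{7i}\right]^{1/3}$$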
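Table 2.5 Original, transformed and general responses

Columns y1 to y7 are the natural (original) responses, ỹ1 to ỹ7 the transformed
partial responses, and Y1, Y2 the general responses.

| No. of trial | y1 | y2 | y3 | y4 | y5 | y6 | y7 | ỹ1 | ỹ2 | ỹ3 | ỹ4 | ỹ5 | ỹ6 | ỹ7 | Y1 | Y2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 272 | 14 | -25 | 103 | 215 | 299 | 103 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 187 | 20 | -23 | 91 | 179 | 254 | 29 | 1 | 0 | 1 | 1 | 0 | 1 | 1 | 0 | 0 |
| 3 | 162 | 21 | -24 | 102 | 216 | 270 | 99 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| 4 | 461 | 14 | -25 | 114 | 198 | 251 | 54 | 1 | 1 | 1 | 1 | 0 | 1 | 1 | 0 | 0 |
| 5 | 267 | 14 | -21 | 105 | 208 | 268 | 31 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 6 | 250 | 24 | -27 | 99 | 220 | 304 | 46 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |
| 7 | 489 | 12 | -25 | 123 | 201 | 238 | 33 | 1 | 1 | 1 | 0 | 1 | 1 | 1 | 0 | 1 |
| 8 | 380 | 14 | -23 | 116 | 230 | 292 | 126 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 9 | 580 | 29 | -22 | 100 | 215 | 304 | 48 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 0 | 1 |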
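The 0/1 transformation and the product rule of Example 2.1 can be reproduced with a few lines of code. This is a sketch under stated assumptions: the quality limits are treated as inclusive (a value exactly on the limit counts as satisfactory), the exponent of the general response is dropped because it does not change a 0/1 product, and all names in the code are illustrative.

```python
from math import prod

# (limit, direction) for y1..y7: +1 means "at least the limit", -1 "at most".
# Treating the limits as inclusive is an assumption.
LIMITS = [(100, +1), (20, -1), (-18, -1), (120, -1), (200, +1), (200, +1), (25, +1)]

# Natural responses y1..y7 of the nine trials, copied from Table 2.5.
TRIALS = [
    (272, 14, -25, 103, 215, 299, 103),
    (187, 20, -23, 91, 179, 254, 29),
    (162, 21, -24, 102, 216, 270, 99),
    (461, 14, -25, 114, 198, 251, 54),
    (267, 14, -21, 105, 208, 268, 31),
    (250, 24, -27, 99, 220, 304, 46),
    (489, 12, -25, 123, 201, 238, 33),
    (380, 14, -23, 116, 230, 292, 126),
    (580, 29, -22, 100, 215, 304, 48),
]

BUYER = (2, 4, 6)  # zero-based positions of y3, y5 and y7


def transform(row):
    """Map each natural response to 1 (satisfactory) or 0 (unsatisfactory)."""
    return [1 if sign * (y - limit) >= 0 else 0
            for y, (limit, sign) in zip(row, LIMITS)]


def general_response(flags, positions=None):
    """Product of the selected 0/1 partial responses (all of them by default)."""
    if positions is None:
        positions = range(len(flags))
    return prod(flags[p] for p in positions)


flags = [transform(row) for row in TRIALS]
Y1 = [general_response(f) for f in flags]           # producer and buyer
Y2 = [general_response(f, BUYER) for f in flags]    # buyer only
```

With inclusive limits the boundary value y2 = 20 in trial 2 counts as satisfactory, whereas Table 2.5 lists it as 0; both general responses of that trial are zero either way, because its tensile strength is below the limit.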

Only three technological procedures (trials 1, 5 and 8) may be recommended by the
first general response. If only the buyer's demands are considered, the materials
obtained in seven trials have a satisfactory quality.

If for each of the partial responses we know the best (ideal) value to be reached,
the general criterion may be made up by taking this into account. Let $y_{u0}$ be the
ideal value of response u. We can then consider the difference $y_{ui} - y_{u0}$ as a
measure of how closely the ideal value of the partial response has been reached. This
difference cannot be used directly to define a general criterion, for two reasons. The
first is that it has the dimension of the associated partial response. The second is
that it may have either a negative or a positive sign. To switch to a non-dimensional
value, it is sufficient to divide the difference by the associated ideal value:
$(y_{ui} - y_{u0})/y_{u0}$.

To eliminate the sign, it is sufficient to square it. In that case the general response is:

$$Y_i = \sum_{u=1}^{n} \left(\frac{y_{ui} - y_{u0}}{y_{u0}}\right)^2 \tag{2.11}$$


If, in a trial, all partial responses equal their associated ideal values, then the
general response has the value zero, Y = 0. That is the general response value that
one should try to reach in this case: the closer to zero, the better. The deficiency
of this generalization procedure is that each partial response has the same share, or
the same importance, in the general response. Practice tells us that responses are
not of the same importance but, on the contrary, differ greatly. This deficiency may
be removed by introducing a significance coefficient $a_u$:

$$Y_i = \sum_{u=1}^{n} a_u \left(\frac{y_{ui} - y_{u0}}{y_{u0}}\right)^2 \tag{2.12}$$

so that:

$$\sum_{u=1}^{n} a_u = 1; \qquad a_u \ge 0$$
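The distance-from-ideal general response, without and with significance coefficients, can be sketched as follows. The function name, the measured values, the ideal values and the weights are invented for illustration; only the formulas (2.11) and (2.12) come from the text.

```python
def distance_response(y, ideal, weights=None):
    """General response as the (optionally weighted) sum of squared relative deviations.

    Without weights this is the unweighted form (2.11); with significance
    coefficients that sum to one it is the weighted form (2.12).
    """
    if len(y) != len(ideal):
        raise ValueError("y and ideal must have the same length")
    if weights is None:
        weights = [1.0] * len(y)
    else:
        if len(weights) != len(y):
            raise ValueError("one weight per partial response is required")
        if any(a < 0 for a in weights) or abs(sum(weights) - 1.0) > 1e-9:
            raise ValueError("weights must be non-negative and sum to 1")
    return sum(a * ((yu - y0) / y0) ** 2 for a, yu, y0 in zip(weights, y, ideal))


# Invented example: two partial responses with ideal values 100 and 20.
ideal = [100.0, 20.0]
trial = [110.0, 20.0]

unweighted = distance_response(trial, ideal)                    # form (2.11)
weighted = distance_response(trial, ideal, weights=[0.5, 0.5])  # form (2.12)
at_ideal = distance_response(ideal, ideal)
```

A trial that hits every ideal value gives zero, and smaller values are better; changing the weights shifts how strongly a miss on each partial response is penalized.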
To assess the degree of significance of each response and to determine the
significance coefficients, the method of expert estimates is used [2, 3]. The
algorithms analyzed so far for constructing general responses are nevertheless
simple. For more complex general responses it is necessary to define a transformation
scale that takes into account finer differences between partial responses.

The  desirability  function 

The most frequently used general response is Harrington's [1, 4] overall desirability
function. The basis of this construction of a general response is the transformation
of partial responses into a non-dimensional desirability scale. To construct a
desirability scale we use a prepared table of standard estimates, Table 2.6.

Table 2.6 Standard estimates on desirability scale

Standard estimates  Desires       Quality of product
1.00                Excellent     The ultimate in “satisfaction” or quality; improvement beyond this point would have no appreciable value
1.00-0.80           Very good     Acceptable and excellent; represents unusual quality or performance, well beyond anything commercially available
0.80-0.63           Good          Acceptable and good; represents an improvement over the best commercial quality, the latter having the value of 0.63
0.63-0.37           Satisfactory  Acceptable but poor; quality is acceptable to the specification limits, but improvement is desired
0.37-0.20           Bad           Unacceptable; materials of this quality would lead to failure of the project
0.20-0.00           Very bad      Completely unacceptable

Partial responses transformed into the non-dimensional scale are marked
d_u (u = 1, 2, ..., n) and called partial desirability or individual desirability. As shown in
Table 2.6 the desirability scale ranges from 0.0 to 1.0. Two characteristic limit
values for quality lie within this range: 0.37 and 0.63. The 0.37 value is approximately
1/e = 0.36788, where e is the base of the natural logarithm, and 0.63 is approximately 1 - 1/e.

Given the mathematical form of the desirability function, it is rational, convenient
and practical to assign the desirability value d = 0.37 to the specification limit of any
quality property, under the assumption that limit values for the quality
really exist. The other practically important value on the desirability scale is the
limit value 0.63, i.e. the value that corresponds to the best commercial quality of the
product that exists and is acceptable. Geometrically, these two limit values are
two points of the curve described by the equation:

\[ d = \exp[-\exp(-y')] = e^{-e^{-y'}} \tag{2.13} \]
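The curve of Eq. (2.13) is simple to evaluate numerically. The sketch below, with hypothetical function names, computes the desirability d from a coded response y' and inverts the relation to find the coded response that corresponds to a chosen desirability; it also shows that the zero of the coded axis maps to d = 1/e.

```python
import math


def desirability(y_coded):
    """Desirability d for a coded response y', following Eq. (2.13)."""
    return math.exp(-math.exp(-y_coded))


def coded_response(d):
    """Inverse of Eq. (2.13): coded response y' for a desirability 0 < d < 1."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly between 0 and 1")
    return -math.log(-math.log(d))


# The zero of the coded axis corresponds to d = 1/e, roughly 0.37.
print(desirability(0.0))
# Coded response at which the curve reaches the "good" limit d = 0.63.
print(coded_response(0.63))
```

The inverse is only defined strictly inside the interval (0, 1), because the curve approaches d = 0 and d = 1 asymptotically.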

The geometric presentation of Eq. (2.13) is in Fig. 2.4. The desirability scale values
are plotted on the ordinate from 0 to 1. The response values in coded form
(y') are on the abscissa. The origin of the abscissa, or its zero, is the exact
point to which the ordinate 0.37 corresponds. It should be noted that the point with
coordinates (0; 0.37) corresponds to the first fold point of the curve. The same may
be said for the value 0.63. The chosen curve in Fig. 2.4 suits real situations,
as it is continuous, monotonic and smooth and, besides, the curve ends are less
sensitive than the center zone. The coded response axis y' is, in principle, divided
into 3 or 6 ranges with reference to zero. The choice of the number of intervals is
important as it determines the curve slope.

Example  2.2 

Let  one  of  the  responses  be  a chemical  reaction  yield  with  limit  values  0.0%  and 
100.0%. 

Assume  that  the  100%  yield  is  equal  to  the  desirability  scale  of  value  1,  and  0.0% 
to  the  value  0. 



The choice of other critical points depends on a series of circumstances such as
result requirements, researcher abilities, etc. Take the case where the yield is 50%.
A 70% yield is hard to imagine, as it may be impossible due to side chemical
reactions. After such an assertion it is clear to a researcher that even a 70% yield is the
same as that of 100%. That, in fact, is the second point of the researcher’s choice
and its value on the desirability scale is close to one. The third point should limit the
“very good” region, which on the desirability scale is between 0.8 and 1.0. Choosing
the corresponding response value for this point is the hardest job so far. If it
is hard to obtain a 70% yield, then 60% would definitely be satisfactory, and that is
the third point on the abscissa. One should not be sure of this conclusion if the
experimental equipment for measuring the yield has a large error and is unable to
distinguish between 60% and 70%. A more optimistic researcher would choose
a yield of 67% as the required value. For the region of good results (0.80-0.63) he
may choose yield values between 60 and 55%. The value of 50% already reached in the
experiment will be taken as the lower limit of satisfactory results. A 45%
yield is simply a bad result. This correspondence is shown geometrically
in Fig. 2.4 as abscissa I. When we have a technological
process with a 95% yield, all this looks different. In that case, greater purity
of the product may be demanded and this, apart from other measures taken, may
require a higher yield of 98%. Such a case is shown geometrically in the
same figure by abscissa II. It should be added that such a desirability scale is
possible only with precise yield measurements. The situation is quite different
when the synthesis of a new product is in question, which so far has not succeeded in
giving enough product even for identification. At a yield of 2%, for example, we are
unable even to identify the product, and a 10% yield would be a real success. This
case is also depicted in Fig. 2.4, as abscissa III.

A desirability curve is often used as a nomogram. Thus, in case I, if the yield is 63%
one reads a desirability estimate of 0.9 from Fig. 2.4. This procedure of reading the
desirability scale from a diagram is often used in practice. If this method is not precise
enough, one uses the analytical method. This means that the coded response y' is read
and the obtained value is then substituted into Eq. (2.13), from which the desirability estimate
is calculated. In the previous example only one quantitative response has been
analyzed, the chemical reaction yield. The problem gets harder if a qualitative
response is in question. In both the first and the second case, it is crucial to determine
the acceptable and unacceptable quality limits. Here one has to remember that
limitations may be one-sided, y_u ≤ y_max or y_u ≥ y_min, or double-sided, y_min ≤ y_u ≤ y_max.
Two situations are possible. The first, a simpler one, is when the researcher has information
on requirements for each partial response or has clear specifications in which either one
or both limitations are defined. Then the estimate d = 0.37 on the desirability scale
corresponds to y_min for a one-sided limitation y_u ≥ y_min, or to y_max for y_u ≤ y_max. In the case of a
double-sided limitation, d = 0.37 corresponds to both y_min and y_max. In the other situation,
the researcher has no specifications available, so the limit values on the desirability
scale are determined based on the runs done and the researcher’s intuition. It is obvious
that in such cases one should not be satisfied with a single researcher’s opinion and intuition,
for it can be highly subjective. Therefore the opinions of several researchers are used, with a
check of the degree of agreement among them by the rank correlation method.


Transformation  of  partial  responses  into  partial/individual  desirability  functions 

Assume we have an experiment for which we have specifications with one or
two limit values for each partial response. For values outside the limit values
we have d_u = 0, and within them d_u = 1. If y_min is the lower limit value of the
specification, then the partial desirability function for a one-sided limitation is:

\[ d_u = \begin{cases} 0, & \text{if } y_u \le y_{\min}; \\ 1, & \text{if } y_u > y_{\min}. \end{cases} \tag{2.14} \]

By analogy, for a double-sided limitation it is:

\[ d_u = \begin{cases} 0, & \text{if } y_u \le y_{\min} \text{ or } y_u \ge y_{\max}; \\ 1, & \text{if } y_{\min} < y_u < y_{\max}. \end{cases} \tag{2.15} \]
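The two-class scale of Eqs. (2.14) and (2.15) amounts to a pair of step functions. The sketch below uses hypothetical function names and assumes that a response lying exactly on a limit value is scored as 0; that boundary convention is an assumption made for the illustration.

```python
def step_one_sided(y, y_min):
    """Two-class desirability for a lower limit, in the spirit of Eq. (2.14)."""
    return 1.0 if y > y_min else 0.0


def step_two_sided(y, y_min, y_max):
    """Two-class desirability for a lower and an upper limit, Eq. (2.15)."""
    return 1.0 if y_min < y < y_max else 0.0


# Hypothetical specification: lower limit 5, upper limit 10.
for value in (4.0, 7.0, 12.0):
    print(value, step_one_sided(value, 5.0), step_two_sided(value, 5.0, 10.0))
```

A response of 12 passes the one-sided check but fails the double-sided one, which is the only difference between the two scales.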
In this way we have reached the simple general response, which has been analyzed
before. The desirability scale has come down to a simple scale with two classes. Both
these cases, Eqs. (2.14) and (2.15), are shown in Fig. 2.5.


Here we have a very simple classification into acceptable and unacceptable quality,
which is rarely met in practice. In a large number of cases, the transformation of partial
responses into partial desirability uses Table 2.6 and the desirability function (2.13).

For one-sided limitations y_u ≤ y_max or y_u ≥ y_min, the partial desirability, limited on one
side, is shown in Fig. 2.6.

Figure 2.6 One-sided desirability
Many responses have such one-sided limitations: tensile strength, strain at break,
shock toughness, etc. One can see this in Example 2.2 where for all given responses
the limitation y_u ≥ y_min is valid. The other form of limitation, y_u ≤ y_max, is typical for
responses such as humidity, specific weight, content of valuable ingredients, etc.
Double-sided desirability limitation is shown in Fig. 2.7.

Double-sided limitation y_min ≤ y_u ≤ y_max is met more seldom than one-sided and it is
more complicated for transformation. The following responses may be mentioned
as examples of double-sided limitations: the molecular weight of a material, bulk
density, etc. The transformation in Fig. 2.7 is mathematically given as:

\[ d = \exp\left[-\left(|y'|\right)^{n}\right] \tag{2.16} \]

where:

e is the base of the natural logarithm, e = 2.71828;
n is a positive number (0 < n < ∞);
y' is the linear transformation of the property variable or of the partial response y_u;
y' = -1 when y_u = y_min, the lower limit value of the specification for the observed quality;
y' = +1 when y_u = y_max, the upper limit value of the quality specification;
|y'| is the absolute value of y'.

Any value of the partial response (of the observed quality) marked as y_u may be
transformed into y' by means of the expression:

\[ y' = \frac{2 y_u - (y_{\max} + y_{\min})}{y_{\max} - y_{\min}} \tag{2.17} \]

Equation (2.16) is a family of curves for which it is valid that:

• they asymptotically approach d = 0 when the absolute value |y'| is above 1.0;

• they pass through d = 1/e = 0.37 when the absolute value |y'| is equal to one, |y'| = 1;

• they pass through d = 1.0 halfway between the lower and upper limit values of
the product quality specification.
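The three properties listed above can be checked numerically. The sketch below assumes the double-sided form d = exp(-|y'|^n) together with the linear coding that maps y_min to -1 and y_max to +1; the function names, the specification limits 10 and 20 and the exponent n = 2 are hypothetical values used only for illustration.

```python
import math


def code_two_sided(y, y_min, y_max):
    """Linear coding that sends y_min to -1 and y_max to +1."""
    return (2.0 * y - (y_max + y_min)) / (y_max - y_min)


def two_sided_desirability(y, y_min, y_max, n=2.0):
    """Double-sided desirability d = exp(-(|y'|**n)) for a positive exponent n."""
    y_coded = code_two_sided(y, y_min, y_max)
    return math.exp(-(abs(y_coded) ** n))


# Hypothetical specification limits 10 and 20.
for value in (10.0, 15.0, 20.0, 30.0):
    print(value, round(two_sided_desirability(value, 10.0, 20.0), 4))
```

At both limits the result is 1/e, at the midpoint it is 1, and well outside the limits it falls toward 0.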


The exponent n in Eq. (2.16) determines the curve slope; when n increases the
curve approaches faster the limit case d = 0.0 outside the specified limits, and d = 1.0
between the limit values. For any desirability curve that corresponds to Eq. (2.16), n
may be calculated by choosing a d value between 0.6 and 0.9, finding the corresponding
absolute value |y'| and substituting it into the equation:

\[ n = \frac{\ln \ln (1/d)}{\ln |y'|} \tag{2.18} \]
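A short numerical illustration of Eq. (2.18) follows. It assumes the double-sided curve d = exp(-|y'|^n); the function name and the chosen point (d = 0.8 at |y'| = 0.5) are hypothetical. Substituting the computed exponent back into the curve recovers the chosen desirability, which is a convenient self-check.

```python
import math


def exponent_n(d, y_abs):
    """Exponent n from a chosen point (|y'|, d) on the curve, Eq. (2.18)."""
    if not 0.0 < d < 1.0:
        raise ValueError("d must lie strictly between 0 and 1")
    if y_abs <= 0.0 or y_abs == 1.0:
        raise ValueError("|y'| must be positive and different from 1")
    return math.log(math.log(1.0 / d)) / math.log(y_abs)


# Hypothetical choice: desirability 0.8 is wanted at |y'| = 0.5.
n = exponent_n(0.8, 0.5)
print(n)
# Self-check: the curve with this exponent passes through the chosen point.
print(math.exp(-(0.5 ** n)))
```

The point |y'| = 1 cannot be used, since every curve of the family passes through d = 1/e there regardless of n.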


Overall  desirability 

After choosing the desirability scale and after the transformation of partial
responses into partial desirability it is possible to approach constructing the general
response D, which is called Harrington’s overall desirability or Harrington’s general
response. To generalize, or switch from d_u to D, is possible by the formula:

\[ D = \left( \prod_{u=1}^{n} d_u \right)^{1/n} \tag{2.19} \]


Equation (2.19) is mathematically the geometric mean of the partial desirabilities. An
example is the requirement that all properties of a material meet the application
requirements. When a property of the material does not satisfy the specification (i.e., the
material is brittle and fragile at a certain temperature), then it cannot be used. If a
partial desirability is d_u = 0, then the overall desirability must also be D = 0.
Conversely, D = 1 if and only if all partial desirabilities are d_u = 1 (u = 1, 2, ..., n).
Overall desirability is highly sensitive to changes in individual ones. The principle
of getting estimates on the desirability scale given in Table 2.6, apart from being
valid for partial desirabilities, is also valid for the overall one: if d_1 = d_2 = ... = d_n = 0.63,
then D = 0.63, and if d_1 = d_2 = ... = d_n = 0.37, then D = 0.37 too, etc. Overall desirability
includes various partial responses such
as: technological, techno-economic, physical-chemical, economic, esthetic, etc.
Example 2.1 considers construction of an overall response by using the desirability
scale with only two values, 0 and 1.
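The properties of the geometric mean described above can be seen in a few lines. This is a minimal sketch with a hypothetical function name; `math.prod` requires Python 3.8 or later.

```python
import math


def overall_desirability(partials):
    """Geometric mean of partial desirabilities, Eq. (2.19)."""
    if not partials:
        raise ValueError("at least one partial desirability is required")
    if any(not 0.0 <= d <= 1.0 for d in partials):
        raise ValueError("partial desirabilities must lie in [0, 1]")
    return math.prod(partials) ** (1.0 / len(partials))


# One unacceptable property makes the whole material unacceptable.
print(overall_desirability([0.9, 0.0, 1.0]))
# Equal partial desirabilities reproduce the same value on the overall scale.
print(overall_desirability([0.63] * 5))
# D = 1 only when every partial desirability equals 1.
print(overall_desirability([1.0, 1.0, 1.0, 1.0]))
```

An arithmetic mean would not have the first property, which is why a single zero cannot be compensated by good values of the other responses here.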


Example  2.3 

Based on data on partial responses from Example 2.1, make up Harrington’s general
response.

As one-sided limited partial responses are in question, use the one-sided limited
desirability as given in Fig. 2.8.

Partial responses transformed into partial desirability are shown in Table 2.7. By
Harrington’s general response:

\[ D_1 = [d_1 \times d_2 \times d_3 \times d_4 \times d_5 \times d_6 \times d_7]^{1/7} \]

four technological procedures obtained good marks, and five procedures have been
satisfactory. By the general response:

\[ D_2 = [d_3 \times d_5 \times d_7]^{1/3} \]

which considers only the buyer’s demands, three procedures were good and six
satisfactory. By comparing the obtained solutions with those from Example 2.1, it is
obvious that Harrington’s general response is finer.

For obtaining coded values y', three ranges have been taken in this example, giving
these codes: -3; -2; -1; 0; +1; +2; +3. When the desirability curve should be regulated,
this may be achieved by changing the number of ranges. To enable the transformation
given in Table 2.7, Table 2.6 should be completed by this information:


d_u        y'     y1   y2  y3   y4   y5   y6   y7
1.00-0.80  3.0    300  7   -30  80   300  330  100
0.80-0.63  1.5    200  10  -25  90   250  280  80
0.63-0.37  0.85   120  15  -20  105  220  250  60
0.37-0.20  0.00   100  20  -18  120  200  200  25
0.20-0.00  -0.50  95   40  -15  130  150  150  20

Table 2.7 Partial responses, partial desirability and overall desirability

     Partial responses                  Partial desirability (1)
No.  y1   y2  y3   y4   y5   y6   y7    d1    d2    d3    d4    d5    d6    d7    D1    Mark  D2    Mark
1    272  14  -25  103  215  299  103   0.98  0.67  0.80  0.71  0.55  0.80  1.00  0.77  G     0.76  G
2    187  20  -23  91   179  254  29    0.77  0.36  0.73  0.79  0.32  0.53  0.29  0.50  S     0.41  S
3    162  21  -24  102  216  270  99    0.68  0.35  0.75  0.72  0.55  0.56  0.99  0.63  G     0.74  G
4    461  14  -25  114  198  251  54    1.00  0.67  0.80  0.47  0.38  0.53  0.57  0.60  S     0.56  S
5    257  14  -21  105  208  268  31    0.97  0.67  0.70  0.63  0.44  0.54  0.39  0.60  S     0.49  S
6    250  24  -27  99   220  304  46    0.95  0.32  0.90  0.70  0.63  0.81  0.48  0.65  G     0.62  S
7    489  12  -25  123  201  238  33    1.00  0.72  0.80  0.35  0.38  0.50  0.40  0.53  S     0.50  S
8    380  14  -23  116  230  292  126   1.00  0.67  0.73  0.44  0.67  0.79  1.00  0.73  G     0.79  G
9    580  29  -22  100  215  304  48    1.00  0.30  0.72  0.70  0.55  0.81  0.52  0.63  S     0.60  S

Remark (1):
G - good mark
S - satisfactory mark
D1 = [d1 x d2 x d3 x d4 x d5 x d6 x d7]^(1/7)
D2 = [d3 x d5 x d7]^(1/3)
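The entries of Table 2.7 can be reproduced from the partial desirabilities. The sketch below recomputes D1 and D2 for trials 1 and 2 and assigns a verbal mark from the ranges of Table 2.6. The function names are hypothetical, and treating each range boundary as belonging to the better class is an assumption, since the table does not say how boundary values are classified.

```python
import math


def geometric_mean(values):
    """Geometric mean used for Harrington's overall desirability."""
    return math.prod(values) ** (1.0 / len(values))


def mark(D):
    """Verbal mark from the desirability ranges (boundary handling assumed)."""
    if D >= 0.80:
        return "very good"
    if D >= 0.63:
        return "good"
    if D >= 0.37:
        return "satisfactory"
    if D >= 0.20:
        return "bad"
    return "very bad"


def d1_d2(partials):
    """D1 over all seven responses, D2 over d3, d5 and d7 only."""
    D1 = geometric_mean(partials)
    D2 = geometric_mean([partials[2], partials[4], partials[6]])
    return D1, D2


# Partial desirabilities d1..d7 of trials 1 and 2, copied from Table 2.7.
trial_1 = [0.98, 0.67, 0.80, 0.71, 0.55, 0.80, 1.00]
trial_2 = [0.77, 0.36, 0.73, 0.79, 0.32, 0.53, 0.29]

for name, partials in (("trial 1", trial_1), ("trial 2", trial_2)):
    D1, D2 = d1_d2(partials)
    print(name, round(D1, 2), mark(D1), round(D2, 2), mark(D2))
```

Values that fall very close to a class limit, such as 0.63, can change class depending on rounding, so marks near a boundary should be read with that in mind.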

Overall desirability is an abstract construct and therefore some of its properties
have to be analyzed, such as lack of fit and statistical effectiveness. It has been
established that the effectiveness and sensitivity of partial and overall desirability are
not lower than the same properties of any technological response. Overall desirability
is quantitative, singular, statistically effective, adequate, etc. It has found wide
application in research on polymeric materials, rubber products, etc.

2.1.2.3 Ranking of the Qualitative Responses

Among the requirements that the response of a research subject has to meet, the
first is that it be quantitative. A researcher usually keeps to this requirement;
however, there are situations when it cannot be met, and the researcher has to
deal with qualitative responses. Because the efficiency of experimental research is
reduced in the case of qualitative responses, one should try to transform
these responses into quantitative ones. For this, one may transform the
qualitative response into partial desirability by means of the desirability scale.

Example  2.4  [5] 

In a full-scale plant for producing double-base propellants a study was done to
discover a high-energy propellant with a high burning rate and low temperature
sensitivity. The problem in making such a propellant was a high percentage of
ignitions, up to 66% of the total number of batches. In discovering and
eliminating the causes of inflammability of certain batches when gelled on rollers, a
great problem was the qualitative nature of the propellant inflammability response, i.e.
the impossibility of expressing the propellant ignition at gelling quantitatively. The research
program for discovering these causes included eight trials that were repeated once.
It should be noted that correct propellant production in this case means that
the propellant was produced after 30 passes over the gelling rollers. The data of all
trials are shown in Table 2.8. Do the ranking of the qualitative response.
Table 2.8 Ranking of the qualitative response

No.  I production: qualitative response  Rank  II production: qualitative response  Rank  d_u
1    Done with no problems               1.00  Done with no problems                1.00  1.000
2    Done with crackling                 0.85  Done with no problems                1.00  0.925
3    Ignition in 22 passes               0.58  Ignition in 10 passes                0.37  0.475
4    Ignition in 17 passes               0.44  Done with no problems                1.00  0.720
5    Done with no problems               1.00  Done with crackling                  0.78  0.890
6    Ignition in 25 passes               0.68  Ignition in 20 passes                0.53  0.603
7    Ignition in 15 passes               0.54  Done with no problems                1.00  0.770
8    Done with crackling                 0.92  Done with no problems                1.00  0.960

Ranking  is  done  by  using  the  one-sided  desirability  given  in  Fig.  2.9. 
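The d_u column of Table 2.8 appears to be the arithmetic mean of the ranks assigned to the two productions of each trial. That reading is inferred from the tabulated numbers and is not stated in the text, so the sketch below should be taken as an assumption; the function name is hypothetical.

```python
def mean_rank(rank_first, rank_repeat):
    """Average of the desirability ranks of a trial and its repetition."""
    return (rank_first + rank_repeat) / 2.0


# Rank pairs (I production, II production) copied from Table 2.8.
ranks = [
    (1.00, 1.00), (0.85, 1.00), (0.58, 0.37), (0.44, 1.00),
    (1.00, 0.78), (0.68, 0.53), (0.54, 1.00), (0.92, 1.00),
]

d_u = [mean_rank(first, repeat) for first, repeat in ranks]
for trial, value in enumerate(d_u, start=1):
    print(trial, round(value, 3))
```

For seven of the eight trials the mean matches the tabulated d_u; for trial 6 it gives 0.605 against the printed 0.603.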


Figure  2.9  Response  ranking  by  desirability  scale 


Summary 

The  construction  of  the  general  response  is  linked  to  defining  one  quantitative 
response  to  a research  subject  with  several  partial  responses,  each  of  which  has  its 
own  physical  interpretation  and  dimension.  To  form  from  such  different  partial 
responses  a unique  response,  it  is  necessary  to  transform  all  partial  responses  into 
non  dimensional  values  by  a unique  scale.  It  is  therefore  necessary  when  defining  a 
general  response  first  to  choose  the  scale  for  doing  the  transformation.  The  scale 
must  be  unique  for  all  partial  responses  to  be  transformed.  The  choice  of  the  scale 
depends  on  preliminary  information  about  partial  responses  and  on  the  required 
precision  of  the  general  response. 

The  next  problem  is  choosing  the  rule  by  which  the  transformed  partial  responses 
will  be  combined  into  general  response.  There  is  no  rule,  and  the  way  to  choose  the 
combinations  is  not  defined.  Certain  approaches  that  a researcher  might  use  have 
been  presented. 
2.1.3

Selection  of  Factors,  Levels  and  Basic  Level 

Having selected the system response, we start choosing factors, levels of the factors
and the center point of the design (basic level or the null point). By factor we understand
a controllable independent variable that corresponds to one possibility of
influencing the object of research. A factor is considered defined if its name and its
domain are determined. A factor may take several values in this domain. The chosen
factor values, both qualitative and quantitative, are called factor variation levels.
Factor variation levels in the design of experiments are coded values. By factor
interval of variation we understand the difference between two factor levels which,
in coded form, equals one. When selecting the factors one should pay
attention to the conditions they must meet.

Factors  should  be: 

• controllable 

• of  high  measurement  precision 

• singular 

• concordant 

• noncorrelated  linear-wise. 

The controllability requirement is linked to the possibility of setting factors
on several levels and maintaining those levels precisely enough. In other words, by changing
factor values, one changes the research subject status or controls the subject.

Factor  singularity  means  its  property  to  directly  change  the  status  of  a research 
subject,  i.e.  it  is  not  a function  of  other  factors  and  it  may  be  fixed  to  any  value  in 
the  domain  of  factors. 

Factor  concordance  is  a property  that  makes  it  possible  for  all  factor  combinations 
to  be  realized  in  an  experiment.  This  property  is  very  important  when  an  experiment 
with  several  simultaneous  factor  variations  is  designed.  It  is  not  a rare  case  where 
the  lack  of  this  property  brings  about  a change  in  defining  a research  problem, 
excluding  some  factors  from  the  experiment,  or  it  changes  the  domain  of  factors. 

The question of linear correlation between factors deserves special attention.
There is a rule saying that in the case of a linear correlation between factors it is
impossible to design an experiment. This is connected with the requirement to keep
in each design point of experiment-trial (one combination of factor levels) each factor
at a corresponding level, independent of the others. Besides, in the case of a
linear correlation between two factors, it is sufficient for only one of them to be included
in the experiment, for inclusion of the other one does not offer any additional
information on the research subject. The optimization problem is often complicated if not all
the observed factors can be expressed quantitatively. The existence of
categorical/qualitative factors is connected with insufficient knowledge of the researched
phenomenon or subject of research. Through a better level of knowledge about the
research subject, categorical/qualitative factors change into quantitative ones. When
categorical/qualitative factors are present, an optimization problem may be solved
in two ways:

• separately for each level of categorical/qualitative factor and then by
comparing the obtained optimal solutions;

• simultaneously for all levels by quantitative defining of a factor on several levels.

Selection of one of these two ways depends on the particular problem. The first one
mostly gives more reliable results but requires a longer time and is more costly. When
we have good preliminary information on a research subject, we may use complex
factors in an experiment: similarity criteria, component concentrations, logarithms,
simplexes of geometric dimensions, etc. [6]. When defining factors it is important to take all
those potential factors that may affect the research subject. If we forget one of the crucial
factors, this may eventually have very bad consequences for the researcher. Namely, a
forgotten factor will, during the experiment, act randomly, taking random values out of
the researcher’s control, which means that the value of the trial error will increase.
When the forgotten factor remains at a fixed level we may infer a false optimum, as
there is no guarantee that the fixed level of the factor is optimal.

In practice we are often faced with a research subject that has several technological
phases and where the response is measured in its last phase. In that case, the subject is
studied cybernetically as a “black box”, like a single technological phase with all the
factors that correspond to the individual technological phases. We had no responses by
individual technological phases in this case, but this may occur. Moreover, response optima
by individual phases contradict the general optimum system. This indicates that
optimization by individual phases of a research subject is justified and possible. In this
way it is possible to incorporate into the design of an experiment factors from
various phases of a research subject, but this is not always necessary.

When selecting a domain of factors one should pay special attention to choosing
the center point of the design (basic level or the null point). The choice of a null point is
associated with selection of the initial status of the research subject from which to perform
optimization. As optimization is connected with improvement of the subject status in
comparison with the status in the null point, it is desirable that the point is in the
optimum region or as close to it as possible. If the mentioned research was preceded
by other experiments on the same subject, the status having the most convenient
response value is taken as the null experiment. The null point is quite often the
center of the domain of factors. The most important alternatives in selecting the basic
and null levels are shown in Fig. 2.10.

Having defined the null point, we choose the factor intervals of variation. The
selection of these intervals means determining the factor values which in their coded
form have the values +1 and -1. By choosing this interval in the experimental
domain we obtain a subdomain, symmetrical about the null point, which is used in the
first experimental phase. When choosing the factor interval of variation one must
keep in mind that the factor values corresponding to levels +1 and -1 must be
different enough from those that correspond to the null level. Therefore in almost
all cases, the variation interval (ε) is taken as twice as large as the error of fixing the factor.
Too large a factor variation interval is also a problem, for it reduces the efficiency of
finding an optimum, especially in regard to the steepest ascent method. On the
contrary, a small variation interval does not present a problem in practice, since the
domain of factors is generally known in advance, including the information on the
expected order of the mathematical model. The variation interval must not be too
small, for in that case, the response effects may not be registered. Block schemes
are shown in Figs. 2.11-2.13.


Figure  2.10  Block  diagram  for  choice  of  center  point 


Figure  2.11  Block  diagram  of  accepting  factor  variation  intervals 
Figure 2.12 Block diagram of accepting factor variation intervals


Figure  2.13  Block  diagram  of  accepting  factor  variation  intervals 

The presented block diagrams link the factor-fixing accuracy, range of response
change and response-surface curvature with the width of the factor variation interval.
When selecting a factor variation interval one should, if possible, account for the
number of factor variation levels in the experimental domain. The experiment range
and optimization efficiency depend on the number of these levels.

In a general case, the number of design points (number of trials, i.e. different states of
the research subject) depends on the number of factor levels and is written:

\[ N = p^{k} \tag{2.20} \]

where: 

N is  number  of  design  points  - trials; 
p is  number  of  factor  levels  and 
k is  number  of  factors. 
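Relation (2.20) can be made concrete by listing the design points of a full factorial design. The sketch below, with a hypothetical function name, enumerates every combination of coded levels for k factors that share the same p levels and confirms that their count is p to the power k.

```python
from itertools import product


def full_factorial(levels, k):
    """All combinations of the given coded levels for k factors."""
    return list(product(levels, repeat=k))


# Two levels (-1, +1) and three factors give 2**3 = 8 design points.
design = full_factorial([-1, +1], 3)
for trial_no, point in enumerate(design, start=1):
    print(trial_no, point)
print(len(design))
```

With three levels per factor the same call, `full_factorial([-1, 0, +1], 3)`, already gives 27 design points, which illustrates how quickly the number of trials grows with the number of levels.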

The relation (2.20) is correct for the case of the same number of variation levels of
each factor. The minimal number of factor variation levels is two and it is most
frequent in the first phase of research. Those are the upper and lower levels, marked as +1
and -1. Factor variations on two levels are applied in screening experiments, in the
phase of movement to the optimum and when describing the research subject by
linear models. This number of factor levels is not sufficient to obtain second-order
models, since any number of curves of different curvature may be drawn through
two points. With an increased number of factor levels, experimental sensitivity is
raised, but so is the number of design points. To obtain a second-order model it is
necessary to do an experiment where factors vary at three, four or more levels. In
our case, the number of factor variation levels is determined in accord with the
research conditions and the chosen design of experiment. Here problems may
appear when the research includes categorical/qualitative factors or those that
change discretely. A categorical/qualitative factor, for example, has no evident
physical sense for the null level. This deficiency of categorical/qualitative factors does not
affect optimization efficiency in the case of the linear model. The situation is more
complicated when, in modeling the second order, one must account for
categorical/qualitative factors (a factor must be varied at least at three levels). Accounting for
these deficiencies, it is recommended to include categorical/qualitative factors only
in the screening experiments and in the methods of designing experiments which
have nothing to do with obtaining nonlinear models, such as: analysis of variance,
random balance method, full-factorial designs on two levels, etc. Factor selection is
completed by making a list of all factors that are of interest in the researcher’s
opinion. Thereby, factor names and marks, their ranges, variation levels and null-point
coordinates are defined.

It is important to note once again that, when factors are being considered, all variables that have even the slightest chance of affecting the research subject are included. In such a situation it is better to include more factors, for the nonessential ones will be rejected in the process of selection. An example of defining factors is shown in Table 2.9.

Table 2.9 Selection of factors

| Name and mark of factor | Level -2 | Level -1 | Level 0 | Level +1 | Level +2 | Variation interval ε |
|---|---|---|---|---|---|---|
| Temperature, °C, X1 | 140 | 150 | 160 | 170 | 180 | 10 |
| Pressure, bar, X2 | 0 | 2.5 | 5 | 7.5 | 10 | 2.5 |
| Concentration, g/cm3, X3 | 0 | 10 | 20 | 30 | 40 | 10 |
| Time, min, X4 | 30 | 60 | 90 | 120 | 150 | 30 |
| Mass, kg, X5 | 100 | 150 | 200 | 250 | 300 | 50 |
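
The rows of Table 2.9 follow a simple pattern: each natural value equals the null-level value plus the coded level times the variation interval. The sketch below assumes that linear coding; the names `FACTORS`, `natural_value` and `coded_level` are hypothetical and only three of the factors are included.

```python
# (null-level value, variation interval) for three factors of Table 2.9
FACTORS = {
    "temperature_C": (160.0, 10.0),
    "pressure_bar": (5.0, 2.5),
    "time_min": (90.0, 30.0),
}


def natural_value(name, level):
    """Natural value of a factor at a given coded level (-2 ... +2)."""
    center, interval = FACTORS[name]
    return center + level * interval


def coded_level(name, value):
    """Coded level that corresponds to a natural factor value."""
    center, interval = FACTORS[name]
    return (value - center) / interval


print(natural_value("temperature_C", +2))  # upper extreme of the temperature row
print(coded_level("time_min", 60.0))       # 60 min sits one interval below the null level
```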

From  the  total  number  of  noted  factors,  the  researcher  chooses  those  that  may  be 
varied  during  the  experiment  while  he  keeps  the  others  at  constant  levels.  When  the 
number  of  selected  factors  is  more  than  seven,  a possibility  of  doing  the  screening 
experiments  has  to  be  considered.  For  a relatively  small  number  of  factors,  an 
experiment  is  done  to  reach  the  optimum  or  to  obtain  the  mathematical  model  of 
the  research  subject,  depending  on  what  the  objective  of  the  researcher’s  problem  is. 

Finally, the extreme values that a factor can take without changing the physical-chemical properties of the research subject are called the physical limits of the factor, and the interval $X_{i\,max} - X_{i\,min}$ is called the domain of the factor. A geometric interpretation is shown in Fig. 2.14.


The domain of factors is marked “O”. The figure clearly shows that, when an optimization problem is being solved, the intervals of factor variation are only a part of the domain of factors. This is necessary so that movement towards the optimum can be realized within this domain. The experiment domain is marked by the letter “E” in the same figure. In studies whose objective is approximation or interpolation, that is mathematical modeling, the factor variation intervals cover the whole domain of factors. For a two-factor experiment the upper levels of factors X1 and X2 then correspond to the values X1max and X2max, while the lower levels have the values X1min and X2min. In that case the domain of factors “O” is called interpolational, and “E” the domain of the extreme experiment.

Summary

In  this  section  we  have  defined  a factor  as  a variable  that  may  affect  the  research 
subject.  For  a variable  to  be  a factor,  it  must,  besides  others,  fulfill  the  requirement 
that  it  is  controllable  and  singular. 

To  control  a factor  means  to  bring  it  to  a corresponding  value  or  level  and  keep  it 
constant  during  a design  point-trial,  or  to  change  it  by  a previously  set  up  program. 

This  is  exactly  the  special  property  of  an  active  or  designed  experiment.  Design  of  an 
experiment  is  possible  only  in  a case  when  a researcher  may,  according  to  his  own 
program,  assign  the  associated  values  or  levels  to  factors. 

Factors should directly change the state of the research subject. It is hard to control a factor that is a function of other variables, but this does not mean that complex factors, such as logarithms, similarity criteria, etc., may not be used in a design of experiments. Besides the mentioned requirements, factors should be concordant and linearly uncorrelated. When some of the significant factors have been left out in selection, a researcher may get a wrong optimum or a large trial error. Factors can be qualitative (categorical) or quantitative. The accuracy of fixing a factor should be high and should depend on the factor variation range. Selection of factors is especially important in defining a research problem, and the result of experimental research greatly depends on it.

2.1.4 

Measuring  Errors  of  Factors  and  Responses 

An important property of design of experiments is the demand for increased accuracy in fixing factors and in measuring responses. The researcher must be able to determine and estimate a measurement error correctly. Measurements and measurement errors are a subject of special study, see [7, 8].

Measurement should not be reduced to simply determining a measured value; it also includes estimating the error of that value, called the measurement error. There are several kinds of errors in measurement: robust, systematic and random.

Robust errors result from a violation of the basic conditions for measuring, from a researcher's mistake, etc. The researcher is expected to check whether a robust error may have occurred. A robust error appears as a measured value that is drastically different from the others. This error may be avoided if the measurement is repeated by another researcher who does not know the former results. The same effect may be achieved when the same researcher repeats the measurements after some time, when he has already forgotten the results of the first ones. If a robust error has been discovered, the result in question has to be rejected.

Systematic errors result from the action of certain factors and recur in numerous repetitions of the same measurement. This kind of error occurs, for example, when measuring is done with an incorrectly calibrated instrument. A systematic error is discovered by measuring the same magnitude with different instruments or by different methods. We distinguish several kinds of systematic errors: errors of known origin and magnitude, errors of known nature but unknown magnitude, and systematic errors of unknown origin. Systematic errors of known origin and magnitude are not a problem, as they may be included in measurement results as corrections. The problem lies in the other kinds of systematic errors.

Error theory, which is based on theoretical probability laws, deals with random errors. As the processing of the results of a designed experiment accepts only random errors, only this kind of error is the subject of analysis here. Random measurement errors are characterized by an associated distribution law, which in most cases is well described by the normal distribution law given in Sect. 1.1.3. The normal distribution is defined by the arithmetic mean $\bar{X}$ of the random value X and the sample variance $S^2$. The value $\bar{X}$ is the most probable value of the measured property and is calculated by the well-known formula:

$$\bar{X} = \frac{1}{u}\sum_{i=1}^{u} X_i \qquad (2.21)$$

where:

$X_i$ are measured values;
u is number of repeated measurements.

The measurement variance is in this case also determined by the well-known formula:

$$\sigma^2 \approx S^2 = \frac{\sum_{i=1}^{u} (X_i - \bar{X})^2}{u - 1} \qquad (2.22)$$

The positive square root of the measurement variance is called the error mean square or standard error:

$$\sigma \approx S = +\sqrt{S^2} \qquad (2.23)$$
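
Expressions (2.21) to (2.23) can be sketched in a few lines, assuming the usual u - 1 denominator for the sample variance. The function name `summarize` and the readings are made up for illustration and do not come from the text.

```python
import math


def summarize(values):
    """Return the mean, the sample variance S^2 and the error mean square S."""
    u = len(values)
    mean = sum(values) / u                                      # Eq. (2.21)
    variance = sum((x - mean) ** 2 for x in values) / (u - 1)   # Eq. (2.22)
    return mean, variance, math.sqrt(variance)                  # Eq. (2.23)


readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical repeated measurements
mean, variance, s = summarize(readings)
print(mean, variance, s)
```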

When estimating measurement results, it is important to know not only their accuracy but also the measurement confidence. The degree of measurement confidence is estimated from the confidence interval as defined by the level of significance. Let X denote the actual measurement value and ΔX the error in measuring the mean $\bar{X}$; then:

$$P(\bar{X} - \Delta X < X < \bar{X} + \Delta X) = 1 - \alpha \qquad (2.24)$$

where 1-α is the confidence coefficient, or the probability that the measurement result is within the confidence interval (2.24). For a 5% level of significance, the confidence interval limits for the measurement mean may be determined if we know the measurement variance for a corresponding number of measurements:

$$X = \bar{X} \pm 1.96\,\frac{S}{\sqrt{u}} \qquad (2.25)$$

This indicates that, to characterize the random error of a measurement, it is not sufficient to know only its magnitude (the confidence interval of the measurement error); one must also know the significance level, which makes it possible to estimate the confidence of the obtained measurements. Using the error mean square as a measure of accuracy is convenient because in a normal distribution that value is associated with a confidence coefficient of 0.68. The doubled error mean square 2S has 0.95 confidence and 3S has a 0.997 confidence level. Knowing the error mean square therefore makes it possible to establish the measurement confidence interval for any confidence coefficient. Table 2.10 is suitable for such calculations; for the associated confidence 1-α it contains ΔX values expressed in units of the error mean square (θ = ΔX/S).

Table 2.10 Confidence interval coefficient

| θ = ΔX/S | 3.9 | 2.6 | 2.4 | 2.0 | 1.65 | 0.7 | 0.3 | 0.15 | 0.05 |
|---|---|---|---|---|---|---|---|---|---|
| 1-α | 0.9999 | 0.990 | 0.984 | 0.950 | 0.90 | 0.51 | 0.24 | 0.12 | 0.04 |
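
The entries of Table 2.10 can be checked approximately under the assumption that the errors are normally distributed: the probability of falling within ±θ error mean squares is then erf(θ/√2). The function name `coverage` is an illustrative choice; the table values are rounded, so small differences are to be expected.

```python
import math


def coverage(theta):
    """Probability that a normally distributed error lies within +/- theta * S."""
    return math.erf(theta / math.sqrt(2.0))


# Compare with the tabulated confidence coefficients of Table 2.10.
for theta in (3.9, 2.6, 2.4, 2.0, 1.65, 0.7, 0.3, 0.15, 0.05):
    print(f"theta = {theta:5.2f}   1 - alpha = {coverage(theta):.4f}")
```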


Example  2.5 

100 measurements (u=100) were done for an unknown property. Using expression (2.21), the mean of all measurements $\bar{X}$=1.27 was determined. The calculation by expression (2.22) gave the measurement error mean square S=0.032. Determine the measurement confidence interval for the confidence coefficient 1-α=0.98.

For the associated confidence, Table 2.10 gives θ=2.4, so that:

$$\Delta X = \theta \times S = 2.4 \times 0.032 = 0.0768 \approx 0.08$$

The measurement confidence interval is:

$$\bar{X} - 0.08 < X < \bar{X} + 0.08$$

$$1.19 < X < 1.35 \quad \text{or} \quad X = 1.27 \pm 0.08$$

This problem may be turned around: for $\bar{X}$=1.27 and S=0.032, what is the probability that the results of individual measurements do not fall outside the measurement confidence interval 1.19<X<1.35?

For ΔX=0.08 (0.0768 before rounding) we obtain:

$$\theta = \frac{\Delta X}{S} = \frac{0.0768}{0.032} = 2.4$$

The probability 0.984 corresponds to the value θ=2.4 in Table 2.10. Hence 98.4% of all individual measurements fall within their confidence interval. So far it is the confidence interval of individual measurements that has been analyzed. In practice, however, it is very important to know the deviation of the arithmetic mean of measurements from the actual value X. This problem was solved more generally in Sect. 1.3.2. The value ΔX is then determined as follows:

$$\Delta X = \pm \frac{t \times S}{\sqrt{u}} \qquad (2.26)$$

where: 

t is  Student’s  distribution,  Table  C; 

S is  error  mean  square  of  measurements; 
u is  number  of  measurements. 

We know that the t-value rises with an increase in confidence or its coefficient, which means that ΔX also goes up, resulting in a decrease of accuracy in determining X. In accord with Eq. (2.26), to maintain accuracy in measuring X it is necessary to reduce the error mean square of measurement S or to increase the number of measurements u. Written as a confidence statement, Eq. (2.26) takes the form:

$$P\left(\bar{X} - \frac{t \times S}{\sqrt{u}} \le X \le \bar{X} + \frac{t \times S}{\sqrt{u}}\right) = 1 - \alpha \qquad (2.27)$$

Equation (2.27) is used to determine the confidence interval, or its limits, for the deviation of the arithmetic mean of measurements from the actual measurement value, for the given confidence coefficient and number of measurements.


Example  2.6 

Determine the confidence interval limits within which the actual measurement value lies at α=0.05. Five measurements were done (u=5). The arithmetic mean is $\bar{X}$=31.2 and S=0.24. From Table C, for α=0.05 and f=u-1=5-1=4, we obtain $t_{0.05}$=2.78, so that:

$$\Delta X = \pm \frac{2.78 \times 0.24}{\sqrt{5}} = \pm 0.30$$

$$31.20 - 0.30 < X < 31.20 + 0.30$$

$$X = 31.20 \pm 0.30$$

Hence we may assert with 0.95 confidence that the actual measurement value is between 30.90 and 31.50. For the same values we may ask the following question: what is the probability, or confidence, that the average measurement value $\bar{X}$=31.20 does not differ from its actual value by more than 0.20? By using Eq. (2.26) we get:

$$t = \frac{\Delta X \times \sqrt{u}}{S} = \frac{0.20 \times \sqrt{5}}{0.24} = 1.86$$

For the calculated value of Student's criterion t=1.86 and for f=u-1=4, Table C gives α=0.14 or 1-α=0.86. Analogous calculations show that for the same error mean square of measurements and for the same ΔX=0.20, an increase in the number of measurements to 10 (u=10) allows an increase in confidence to 0.97, for:

$$t = \frac{0.20 \times \sqrt{10}}{0.24} = 2.6$$
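
The arithmetic of Example 2.6 can be retraced with a short sketch of Eq. (2.26). The t-value is taken from the table as an input rather than computed, and the function names `half_width` and `t_statistic` are illustrative assumptions.

```python
import math


def half_width(t, s, u):
    """Half-width of the confidence interval of the mean, Eq. (2.26)."""
    return t * s / math.sqrt(u)


def t_statistic(delta_x, s, u):
    """Eq. (2.26) solved for t, for a prescribed deviation delta_x."""
    return delta_x * math.sqrt(u) / s


# Values quoted in Example 2.6: t = 2.78 (Table C), S = 0.24, u = 5.
print(round(half_width(2.78, 0.24, 5), 2))

# t-values for an allowed deviation of 0.20 with u = 5 and u = 10.
print(round(t_statistic(0.20, 0.24, 5), 2))
print(round(t_statistic(0.20, 0.24, 10), 2))
```

Looking up the confidence that belongs to each computed t-value still requires a table of Student's distribution for f = u - 1 degrees of freedom.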

Hence, calculations based on relation (2.26) make it possible to determine the necessary number of measurements (u). For that purpose it is of course necessary to define in advance the size of the random error that may be accepted and the coefficient or degree of measurement confidence. In practice, we are satisfied with the level that is not above 0.5%. Table 2.11 is used for practical determination of the necessary number of measurements, for a known measurement confidence 1-α and for different confidence interval limits expressed in units of the error mean square of measurement, ΔX/S.

Table 2.11 Number of measurements

| ΔX/S | 1-α = 0.90 | 1-α = 0.95 | 1-α = 0.99 | 1-α = 0.999 |
|---|---|---|---|---|
| 1.0 | 5 | 7 | 11 | 17 |
| 0.5 | 13 | 18 | 31 | 50 |
| 0.4 | 19 | 27 | 46 | 74 |
| 0.3 | 32 | 46 | 78 | 127 |
| 0.2 | 70 | 99 | 171 | 277 |

If it turns out that the number of measurements would have to be increased drastically to reduce the random error, it is more acceptable to look for a way of increasing the measurement accuracy, that is of reducing the error mean square of measurement S. This may be achieved by changing the measurement method or by using more up-to-date equipment. The error mean square of measurement obtained from the results may also be used to discover robust (extreme) measurement values. When a researcher thinks that a measurement has an extreme value, the following value of Student's t-criterion is calculated:

$$t = \frac{X_E - \bar{X}}{S} \qquad (2.28)$$

where:

$X_E$ is extreme measurement value;
$\bar{X}$ is arithmetic mean of the other measurements, without the extreme one.

The calculated t-criterion value is then compared with the tabular value for the associated degrees of freedom and significance level. When the calculated value is above the tabular one, the extreme measurement value is a robust error and should be rejected.


Example  2.7 

The mean $\bar{X}$=6.500 was obtained from u=41 measurements of a property. The associated error mean square is S=0.133. The researcher assumes that the single measurement $X_E$=6.866 is a robust error.

Checking gives:

$$t_R = \frac{6.866 - 6.500}{0.133} = 2.75$$

From Table C we obtain $t_T$=2.74 for the significance level α=0.01 and f=u-1=41-1=40. Since $t_R$=2.75 > $t_T$=2.74, the researcher was right and the analyzed measurement should be dropped. Note that the same procedure for rejecting extreme values was demonstrated in Sect. 1.5.
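
The check of Example 2.7 can be sketched as follows. Taking the absolute value of the deviation is an assumption made here so that unusually low values are treated the same way as high ones; the tabular value is passed in from the text, and the names `outlier_t`, `t_calc` and `t_table` are illustrative.

```python
def outlier_t(x_extreme, mean_others, s):
    """Calculated t-criterion of Eq. (2.28) for a suspected extreme value."""
    return abs(x_extreme - mean_others) / s


t_calc = outlier_t(6.866, 6.500, 0.133)
t_table = 2.74  # tabular value quoted in Example 2.7

# The suspected value is rejected when the calculated t exceeds the tabular one.
is_robust_error = t_calc > t_table
print(round(t_calc, 2), is_robust_error)
```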

When doing experimental research, one should distinguish several kinds of errors: measurement error, trial error and experiment error. These errors will be analyzed in detail in a subsequent chapter.

2.2

Screening  Experiments 

2.2.1 

Preliminary  Ranking  of  the  Factors 

We shall here consider methods that are applied in processing reference data and that simultaneously serve as the first phase of experimental research in cases when the most important factors have to be selected from the total number of factors. In this phase of formalizing the preliminary information, it is very useful to apply a psychological experiment. This experiment is a method of objective processing of the data obtained from researchers, from specialists in the observed field, or from reference literature. This kind of experiment facilitates objective knowledge of a research subject, the accepting or rejecting of preliminarily stated hypotheses, an objective comparison of the effects of different factors on the system response and, hence, a correct selection of factors for the active experiment phase.

The method of preliminary ranking of the factors is based on the methods of rank correlation [9]. The essence of this method is that the factors are ranked, in accord with preliminary information, by the order of their effects on the system response. The effect of each factor is judged by the rank (place) that each researcher has given to it (based on the researcher's enquiry, expert papers, literature, etc.) when ranking all the factors by their assumed effect on the response (the quantitative effect is unknown). When information is gathered, each researcher is required to fill in an enquiry on the order of effects of the given factors on a certain response. The enquiry includes the factors, their dimensions and assumed variation intervals. The researcher fills in the enquiry by defining the place of each factor in the ranking order. Each researcher may at the same time supplement the enquiry with new facts and suggested variation intervals. The enquiry results, or the ranking by reference data, are processed in the following way.

First the sums of ranks by factors $\sum_{j=1}^{m} a_{ij}$ are determined, then the differences $\Delta_i$ between the sum of ranks for each factor and the average sum of ranks, and finally the sum of squared deviations S:

$$\Delta_i = \sum_{j=1}^{m} a_{ij} - \frac{\sum_{i=1}^{k}\sum_{j=1}^{m} a_{ij}}{k} = \sum_{j=1}^{m} a_{ij} - T \qquad (2.29)$$

$$S = \sum_{i=1}^{k} (\Delta_i)^2 \qquad (2.30)$$

where:

$a_{ij}$ is rank of factor i with researcher j,
m is number of researchers,
k is number of factors,
T is average sum of ranks.

The obtained results facilitate constructing the rank graph. However, prior to that, one should determine the degree of concordance of the opinions of all the researchers by the concordance coefficient:

$$\omega = \frac{12 \times S}{m^2 (k^3 - k) - m \sum_{j=1}^{m} T_j} \qquad (2.31)$$

where:

$$T_j = \sum (t_j^3 - t_j)$$

$t_j$ is number of equal ranks in ranking j.

Before drawing a conclusion based on the concordance coefficient, it is necessary to test its significance. One should keep in mind that the value m(k-1)ω has a χ² distribution with f=k-1 degrees of freedom. The calculated value of the χ² criterion is obtained by the formula:

$$\chi^2 = \frac{12 \times S}{m k (k + 1) - \frac{1}{k - 1} \sum_{j=1}^{m} T_j} \qquad (2.32)$$

The hypothesis on concordance of the researchers' opinions is accepted if, for the given degrees of freedom and significance level, the calculated value of χ² is above the tabular one [10].
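
The tie term of Eq. (2.31) is the part of the calculation most easily miscounted by hand. The sketch below assumes that the sum in the definition of the tie term runs over every group of equal ranks within one researcher's ranking; the function name `tie_correction` is hypothetical.

```python
from collections import Counter


def tie_correction(ranking):
    """Sum of (t**3 - t) over every group of t equal ranks in one ranking."""
    counts = Counter(ranking)
    return sum(t ** 3 - t for t in counts.values() if t > 1)


# A ranking of eight factors with one pair (6.5, 6.5) and one triple (4, 4, 4):
# the pair contributes 2**3 - 2 = 6 and the triple 3**3 - 3 = 24.
print(tie_correction([1, 2, 6.5, 6.5, 4, 4, 4, 8]))

# A ranking without equal ranks contributes nothing.
print(tie_correction([1, 2, 3, 4, 5, 6, 7, 8]))
```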

Once the concordance of the researchers' opinions has been established, the rank graph is constructed by plotting the factors on the abscissa and the associated sums of ranks on the ordinate, but in the opposite direction. Because of this direction of plotting the sums of ranks, a larger rectangle above the abscissa corresponds to a smaller sum of ranks. Depending on the shape of the rank curve, which connects the histogram rectangles, different decisions may be made.

• When the rank curve has the shape shown in Fig. 2.15, the factors from the first half of the graph enter the basic experiment;
• When the rank curve is continually decreasing, as in Figs. 2.16 and 2.17, all factors are included in the next active experiment for selecting factors.


Figure 2.15 Rank graph


Figure  2.16  Rank  graph  Figure  2.17  Rank  graph 


Example  2.8  [11] 

A composite rocket propellant is mixed in a vertical planetary mixer. According to the reference literature, the viscosity of the mixed propellant depends on eight factors: X1 mixing temperature; X2 mixing time after addition of the third portion of ammonium perchlorate; X3 mixing rate; X4 mixture mass; X5 mixing time after addition of the first portion of ammonium perchlorate; X6 mixing time after addition of the second portion of ammonium perchlorate; X7 mixing time of premix; and X8 vacuum in the mixer. The outcomes of the inquiry of eight researchers are presented in Table 2.12. Complete the factor selection based on the significance of their effect on dynamic viscosity by applying the method of prior ranking of the factors.

The following values may be taken from Table 2.12:

$$\sum_{j=1}^{m} T_j = 120; \quad T = 36; \quad S = 2019.50$$

so that:

$$\omega = \frac{12 \times 2019.50}{8^2 (8^3 - 8) - 8 \times 120} = 0.77$$

Since the concordance coefficient is significantly different from zero, one may say that there is concordance among the opinions of the eight researchers. To be sure, the significance of the concordance coefficient was checked by Eq. (2.32):

$$\chi^2_R = \frac{12 \times 2019.50}{8 \times 8 \times (8 + 1) - \frac{1}{8 - 1} \times 120} = 43.36; \quad f = 8 - 1 = 7$$

Table 2.12 Results of ranking factors

| Researchers | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | $T_j = \sum (t_j^3 - t_j)$ |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 6.5 | 6.5 | 4 | 4 | 4 | 8 | 24+6=30 |
| 2 | 1 | 2 | 3.5 | 3.5 | 7.5 | 7.5 | 5.5 | 5.5 | 6+6+6=18 |
| 3 | 1 | 2.5 | 2.5 | 5 | 5 | 5 | 7.5 | 7.5 | 6+24+6=36 |
| 4 | 2.5 | 1 | 2.5 | 4 | 5.5 | 5.5 | 7 | 8 | 6+6=12 |
| 5 | 1 | 2.5 | 2.5 | 4 | 5.5 | 5.5 | 7 | 8 | 6+6=12 |
| 6 | 1 | 3 | 2 | 4 | 5 | 6 | 7 | 8 | 0 |
| 7 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 0 |
| 8 | 1 | 2.5 | 4.5 | 6 | 2.5 | 4.5 | 8 | 7 | 6+6=12 |
| $\sum_{j=1}^{m} a_{ij}$ | 9.5 | 19.5 | 26.0 | 36.0 | 40.0 | 44.0 | 53.0 | 60.0 | $\sum T_j = 120$ |
| $\Delta_i$ | -26.5 | -16.5 | -10.0 | 0.0 | 4.0 | 8.0 | 17.0 | 24.0 | |
| $(\Delta_i)^2$ | 702.25 | 272.25 | 100.00 | 0.0 | 16.0 | 64.0 | 289.0 | 576.0 | $S = \sum (\Delta_i)^2 = 2019.50$ |

From Table D, for α=0.05 and f=7 degrees of freedom, the tabular value is $\chi^2_{7;\,95\%}$=14.1. Since the calculated value of the χ² criterion is above the tabular value, the hypothesis on concordance of the researchers' opinions is accepted. This conclusion allows us to construct the rank graph in Fig. 2.18. The rank graph clearly shows an even distribution, so all eight factors should be included in the active experiment for factor selection.
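
The two numerical results of Example 2.8 can be retraced from the summary values alone. The sketch below is a direct transcription of Eqs. (2.31) and (2.32); the function names are illustrative, and the inputs S = 2019.50, m = 8, k = 8 and a tie sum of 120 are those quoted in the example.

```python
def concordance_coefficient(S, m, k, tie_sum):
    """Concordance coefficient of Eq. (2.31)."""
    return 12.0 * S / (m ** 2 * (k ** 3 - k) - m * tie_sum)


def chi_square(S, m, k, tie_sum):
    """Calculated chi-square criterion of Eq. (2.32)."""
    return 12.0 * S / (m * k * (k + 1) - tie_sum / (k - 1))


omega = concordance_coefficient(2019.50, m=8, k=8, tie_sum=120)
chi2 = chi_square(2019.50, m=8, k=8, tie_sum=120)
print(round(omega, 2), round(chi2, 2))

# The two formulas are linked by chi2 = m * (k - 1) * omega.
print(abs(chi2 - 8 * (8 - 1) * omega) < 1e-9)
```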


Figure  2.18  Rank  graph 


Figure 2.19 Rank histogram

Example 2.9 [12]

A system consisting of a column and a cooler produces a material characterized by density as the system response. The density of the observed product is affected by six factors: X1 chlorine consumption; X2 water consumption in the column; X3 phlegmatizer consumption; X4 temperature in the column; X5 level of liquid in the column; and X6 water consumption in the cooler. The opinions of four researchers are given in Table 2.13. Check the concordance of the researchers' opinions.

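Table 2.13 Results of ranking factors

| Researchers | X1 | X2 | X3 | X4 | X5 | X6 | $T_j = \sum (t_j^3 - t_j)$ |
|---|---|---|---|---|---|---|---|
| 1 | 1.5 | 5 | 1.5 | 4 | 3 | 6 | 2³-2=6 |
| 2 | 2 | 3 | 1 | 4.5 | 4.5 | 6 | 2³-2=6 |
| 3 | 2 | 3 | 1 | 5.5 | 5.5 | 4 | 2³-2=6 |
| 4 | 1.5 | 3.5 | 1.5 | 5 | 3.5 | 6 | (2³-2)+(2³-2)=12 |
| $\sum_{j=1}^{m} a_{ij}$ | 7 | 14.5 | 5 | 19 | 16.5 | 22 | $\sum T_j = 30$ |
| $\Delta_i$ | -7 | 0.5 | -9 | 5 | 2.5 | 8 | |
| $(\Delta_i)^2$ | 49 | 0.25 | 81 | 25 | 6.25 | 64 | $S = \sum (\Delta_i)^2 = 225.5$ |

$$\omega = \frac{12 \times 225.5}{4^2 (6^3 - 6) - 4 \times 30} = 0.835$$

$$\chi^2_R = \frac{12 \times 225.5}{4 \times 6 \times (6 + 1) - \frac{1}{6 - 1} \times 30} = 16.7$$

Since the calculated value is above the tabular value $\chi^2_T$=11.07 for α=0.05 and f=6-1=5, the hypothesis on the concordance of the researchers' opinions is accepted. A histogram of sums of ranks is shown in Fig. 2.19.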

The rank histogram shows that the sum of ranks does not change evenly, so that we can accept the solution to include the following four factors in the basic design of experiment: X3, X1, X2 and X5. A more cautious approach to drawing conclusions suggests a more detailed check of all six factors in an active screening experiment, for example by the method of random balance.
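
The whole procedure of prior ranking, from the rank matrix to the concordance test, can be sketched for the data of Table 2.13. The function name `concordance` and the layout of the rank matrix (one row per researcher) are assumptions made for illustration. The sketch keeps the tie terms of Eqs. (2.31) and (2.32) in full, which gives ω ≈ 0.835 and χ² ≈ 16.7; leaving the tie terms out would give about 0.805 and 16.1.

```python
from collections import Counter


def concordance(ranks):
    """ranks[j][i] is the rank that researcher j gave to factor i."""
    m, k = len(ranks), len(ranks[0])
    rank_sums = [sum(row[i] for row in ranks) for i in range(k)]
    mean_sum = sum(rank_sums) / k                        # T in Eq. (2.29)
    S = sum((s - mean_sum) ** 2 for s in rank_sums)      # Eq. (2.30)
    tie_sum = sum(
        sum(t ** 3 - t for t in Counter(row).values())   # tie term of each researcher
        for row in ranks
    )
    omega = 12 * S / (m ** 2 * (k ** 3 - k) - m * tie_sum)   # Eq. (2.31)
    chi2 = 12 * S / (m * k * (k + 1) - tie_sum / (k - 1))    # Eq. (2.32)
    return rank_sums, S, tie_sum, omega, chi2


table_2_13 = [
    [1.5, 5.0, 1.5, 4.0, 3.0, 6.0],
    [2.0, 3.0, 1.0, 4.5, 4.5, 6.0],
    [2.0, 3.0, 1.0, 5.5, 5.5, 4.0],
    [1.5, 3.5, 1.5, 5.0, 3.5, 6.0],
]

rank_sums, S, tie_sum, omega, chi2 = concordance(table_2_13)
print(rank_sums, S, tie_sum)
print(round(omega, 3), round(chi2, 1))

# Factors ordered by increasing sum of ranks, most important first.
order = sorted(range(1, 7), key=lambda i: rank_sums[i - 1])
print(["X%d" % i for i in order])
```

The calculated χ² is then compared with the tabular value for f = k - 1 degrees of freedom, exactly as in the example.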


Problem 2.1

A study of the effect of twelve factors of material preparation on fiber tensile strength was done in the textile industry. Four researchers were asked for a prior ranking of the factors. The rank matrix obtained from the poll of researchers is shown in Table 2.14. Check the concordance of the researchers' opinions and choose the significant factors for the next step of experimental research.

Table 2.14 Results of ranking of the factors

| Researchers | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | X11 | X12 | $T_j$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 8 | 10.5 | 10.5 | 10.5 | 1 | 2.5 | 2.5 | 10.5 | 5 | 4 | 7 | 6 | 60+6=66 |
| 2 | 8 | 9 | 10 | 11 | 1 | 6.5 | 6.5 | 12 | 2 | 3 | 4 | 5 | 8-2=6 |
| 3 | 6 | 7.5 | 7.5 | 11 | 2 | 4.5 | 4.5 | 12 | 1 | 3 | 9.5 | 9.5 | 6+6+6=18 |
| 4 | 7 | 4 | 8 | 10.5 | 2 | 10.5 | 10.5 | 10.5 | 1 | 3 | 5.5 | 5.5 | 60+6=66 |
| $\sum_{j=1}^{m} a_{ij}$ | 29 | 31 | 36 | 43 | 6 | 24 | 24 | 45 | 9 | 13 | 26 | 26 | $\sum T_j = 156$ |
| $\Delta_i$ | 3 | 5 | 10 | 17 | -20 | -2 | -2 | 19 | -17 | -13 | 0 | 0 | |
| $(\Delta_i)^2$ | 9 | 25 | 100 | 289 | 400 | 4 | 4 | 361 | 289 | 169 | 0 | 0 | $S = \sum (\Delta_i)^2 = 1650$ |

Problem 2.2

In a process of chlorinating titanium, seven technological factors were suggested for analysis. Five researchers were asked for a prior ranking of the factors. The rank matrix is shown in Table 2.15. Determine the concordance coefficient for the researchers' opinions.


Table 2.15 Results of ranking of factors

| Researchers | X1 | X2 | X3 | X4 | X5 | X6 | X7 | $T_j$ |
|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 6 | 4 | 7 | 3 | 5 | 0 |
| 2 | 1 | 2 | 7 | 6 | 3 | 5 | 4 | 0 |
| 3 | 7 | 1 | 6 | 4 | 2 | 5 | 3 | 0 |
| 4 | 3 | 1 | 5 | 6 | 4 | 7 | 2 | 0 |
| 5 | 1 | 2 | 6 | 4 | 5 | 7 | 3 | 0 |
| $\sum_{j=1}^{m} a_{ij}$ | 13 | 8 | 30 | 24 | 21 | 27 | 17 | $\sum T_j = 0$ |
| $\Delta_i$ | -7 | -12 | 10 | 4 | 1 | 7 | -3 | |
| $(\Delta_i)^2$ | 49 | 144 | 100 | 16 | 1 | 49 | 9 | S=368 |

Problem 2.3 [16]

The selection of factors in work [16] was done according to the significance of their effects on caoutchouc drying, after processing the opinions of eighteen researchers. The analysis included these eleven factors: X1 inlet and outlet moisture ratio; X2 pH value in the sixth apparatus; X3 pH value in the seventh apparatus; X4 NaCl consumption; X5 serum consumption in the sixth apparatus; X6 serum consumption in the seventh apparatus; X7 serum temperature; X8 latex type; X9 fat content in caoutchouc; X10 latex fat consumption on machine; and X11 quantity of latex on the surface. Outcomes of the ranking of the factors are shown in Table 2.16. Determine the concordance coefficient and check its significance.

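Table 2.16 Results of ranking of factors

| Researchers | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | X11 | Numbers of equal ranks $t_j$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 2 | 5.5 | 5.5 | 5.5 | 8.5 | 8.5 | 10.5 | 2 | 2 | 5.5 | 10.5 | 3+4+2+2 |
| 2 | 1 | 8 | 3 | 6 | 11 | 10 | 7 | 2 | 9 | 5 | 4 | 0 |
| 3 | 1 | 4 | 7 | 8 | 6 | 10 | 11 | 3 | 5 | 2 | 9 | 0 |
| 4 | 3 | 9 | 1 | 4 | 9 | 9 | 5 | 9 | 6 | 2 | 9 | 5 |
| 5 | 9 | 11 | 1.5 | 6 | 7.5 | 7.5 | 3 | 10 | 4 | 1.5 | 5 | 2+2 |
| 6 | 2.5 | 2.5 | 8.5 | 6 | 6 | 8.5 | 10.5 | 2.5 | 6 | 2.5 | 10.5 | 4+2+3+2 |
| 7 | 2 | 2 | 6 | 6 | 6 | 9 | 10.5 | 2 | 6 | 6 | 10.5 | 3+5+2 |
| 8 | 1 | 2.5 | 6 | 6 | 6 | 6 | 10 | 2.5 | 6 | 10 | 10 | 2+5+3 |
| 9 | 1.5 | 5.5 | 9 | 5.5 | 9 | 9 | 9 | 3.5 | 9 | 1.5 | 3.5 | 2+2+2+5 |
| 10 | 1 | 7 | 7 | 2 | 7 | 7 | 7 | 3.5 | 3.5 | 10 | 11 | 2+5 |
| 11 | 2 | 5.5 | 5.5 | 7 | 9 | 9 | 9 | 3 | 4 | 1 | 11 | 2+3 |
| 12 | 5.5 | 2.5 | 10 | 8 | 5.5 | 11 | 2.5 | 2.5 | 8 | 2.5 | 8 | 4+2+3 |
| 13 | 5 | 2.5 | 9.5 | 7.5 | 9.5 | 11 | 2.5 | 1 | 5 | 5 | 7.5 | 3+2+2+2 |
| 14 | 5 | 3.5 | 6.5 | 3.5 | 9 | 10 | 8 | 1.5 | 6.5 | 1.5 | 11 | 2+2+2 |
| 15 | 2 | 1 | 9 | 5 | 7 | 8 | 10 | 3 | 4 | 6 | 11 | 0 |
| 16 | 1.5 | 7 | 4 | 7 | 10 | 10 | 10 | 3 | 5 | 1.5 | 7 | 2+3+3 |
| 17 | 1 | 4 | 10 | 9 | 7 | 8 | 6 | 2 | 5 | 3 | 11 | 0 |
| 18 | 5.5 | 5.5 | 2 | 3.5 | 8.5 | 8.5 | 8.5 | 1 | 3.5 | 8.5 | 11 | 2+2+4 |
| $\sum_{j=1}^{18} a_{ij}$ | 51.5 | 88.5 | 111 | 105.5 | 141.5 | 160 | 140 | 57 | 97.5 | 75 | 160.5 | $\sum_{j=1}^{18} T_j = 946$ |

Problem 2.4 [14]

Sixteen factors were defined in the process of refining petroleum oils. By the method of prior ranking, determine the concordance coefficient and check its significance. Concordance of data from reference literature is checked based on ranking outcomes shown in Table 2.17.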


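Table 2.17 Results of ranking of factors

| R. | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | X11 | X12 | X13 | X14 | X15 | X16 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 2 | 1 | 1 | 2 | 1 | 1 | 2 | 2 | 4 | 3 | 2 | 4 | 5 | 5 | 3 |
| 2 | 1 | 4 | 1 | 1 | 3 | 1 | 1 | 2 | 2 | 4 | 3 | 2 | 4 | 4 | 5 | 3 |
| 3 | 1 | 2 | 5 | 1 | 3 | 1 | 5 | 4 | 4 | 4 | 3 | 4 | 3 | 3 | 1 | 4 |
| 4 | 2 | 2 | 2 | 2 | 3 | 2 | 1 | 2 | 4 | 2 | 3 | 3 | 2 | 2 | 3 | 4 |
| 5 | 2 | 3 | 2 | 2 | 2 | 2 | 1 | 2 | 3 | 2 | 1 | 3 | 4 | 3 | 3 | 4 |
| 6 | 2 | 2 | 2 | 1 | 2 | 1 | 3 | 2 | 2 | 2 | 3 | 2 | 2 | 2 | 2 | 4 |
| 7 | 3 | 3 | 2 | 2 | 3 | 2 | 1 | 3 | 2 | 4 | 2 | 3 | 3 | 2 | 4 | 3 |
| 8 | 3 | 1 | 4 | 3 | 2 | 2 | 2 | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 4 | 3 |
| 9 | 1 | 2 | 4 | 4 | 2 | 4 | 4 | 2 | 3 | 3 | 3 | 4 | 3 | 4 | 4 | 4 |
| 10 | 2 | 2 | 4 | 4 | 3 | 2 | 3 | 2 | 2 | 4 | 2 | 4 | 5 | 4 | 3 | 5 |
| 11 | 1 | 3 | 2 | 2 | 4 | 2 | 5 | 4 | 4 | 2 | 4 | 3 | 2 | 4 | 3 | 3 |
| 12 | 2 | 1 | 5 | 4 | 2 | 5 | 4 | 3 | 2 | 3 | 3 | 5 | 5 | 3 | 3 | 4 |
| 13 | 3 | 1 | 5 | 4 | 3 | 5 | 4 | 3 | 4 | 2 | 2 | 5 | 3 | 3 | 5 | 4 |
| 14 | 2 | 2 | 5 | 2 | 2 | 2 | 1 | 3 | 4 | 2 | 4 | 5 | 3 | 4 | 3 | 4 |
| 15 | 2 | 2 | 5 | 3 | 3 | 3 | 3 | 5 | 2 | 2 | 3 | 4 | 4 | 3 | 4 | 3 |
| 16 | 1 | 1 | 4 | 2 | 2 | 4 | 3 | 2 | 3 | 3 | 3 | 4 | 3 | 3 | 3 | 3 |
| 17 | 3 | 1 | 4 | 3 | 2 | 5 | 3 | 2 | 4 | 3 | 3 | 3 | 2 | 3 | 3 | 3 |
| $\sum_{j=1}^{17} a_{ij}$ | 32 | 34 | 37 | 39 | 42 | 44 | 45 | 46 | 48 | 50 | 51 | 62 | 55 | 56 | 58 | 63 |
| $\Delta_i$ | 16 | 14 | 11 | 9 | 6 | 4 | 3 | 2 | 0 | 2 | 3 | 14 | 7 | 8 | 10 | 15 |
| $(\Delta_i)^2$ | 256 | 196 | 121 | 81 | 36 | 16 | 9 | 4 | 0 | 4 | 9 | 196 | 49 | 64 | 100 | 225 |

$\sum T_j = 615$; S=1366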

2.2.2 

Active  Screening  Experiment-Method  of  Random  Balance 

The application of screening experiments is obligatory when operating with a
relatively large number of factors (k>7), because in the first phase it facilitates
the exclusion of all those factors that do not affect the response greatly. Screening
experiments thus considerably simplify the research of the factor space (domain) and
the modeling of the response surface. One active selective method that may be
applied in solving this problem is the analysis of variance.

Analysis of variance, as has been said in Sect. 1.5, is based on the fact that the
significance of the effects of certain factors depends on their contribution to the
response variance. In practice, the analysis of variance is used less frequently when
the number of factors is large, for it requires a relatively large number of design
points (trials).

The efficiency of active screening experiments may be greatly improved by using
the method of random balance. This method facilitates a relatively simple selection of
significant factors above the noise level, i.e. the level of random response variation,
in a limited number of design points. The method is based on the same principle as
analysis of variance, i.e. on the fact that the significance of certain factor effects
depends on their contribution to the response variance [15]. The selection of factors
by the significance of their effect on the response is realized by the method of random
balance in two phases. In the first phase, the matrix of the design of experiment is
defined, the experiment takes place and, based on its results, scatter diagrams are
constructed. In the second phase, significant factors are taken from the scatter
diagrams, and their selection is confirmed by calculations familiar from analysis of
variance [15].

Constructing the design of experiments matrix is preceded by coding the factors,
selecting the variation levels and determining the experiment center. The
factor-variation levels should be chosen as the upper and lower limits of normal
production or as the limit values of factor variation in the domain of the factor.
Factor coding, the choice of variation levels and the determination of the center of
the experiment are done by the rules valid for all methods of design of experiments.
By the random balance method, factors are mostly varied on two levels (+1; -1),
although varying on more than two levels is possible. The number of design points in
a matrix is defined so as to be divisible by two. This property simplifies calculations
and facilitates estimating linear effects in all cases.

A design matrix by the method of random balance may be constructed in two ways:

• by random distribution of variation levels (+; -) by columns with the help of
the random numbers table;

• by random mixing of rows of regular fractional replicas of factorial experiments.

The second method of constructing a design matrix is much more widespread,
while the first method of pure random balance is less efficient and used less
frequently. When using fractional replicas one can profit from semireplicas of a full
factorial experiment. A semireplica may be used for one half of the factors directly,
while for the other half, levels are determined by a random choice of rows from the
same semireplica. The factors in a random balance matrix are distributed in such a
way that the factors that are significant according to prior information or the method
of prior ranking of factors are in the first part of the matrix. This principle may in
some cases reduce the number of design points in the next phases, especially if after
applying the method of random balance we try to find the optimum. After constructing
the matrix, it should be checked in this way:

• the matrix is correct provided there are no correlated columns in it, i.e. no two
different columns whose signs either coincide in every row or are opposite in
every row;

• a matrix must not have columns whose scalar product with any other column
gives a column with the same signs.
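
The first check can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `correlated_pairs` and the small matrices are hypothetical and not taken from the text, and levels are coded as +1 and -1.

```python
from itertools import combinations


def correlated_pairs(matrix):
    """Return pairs of column indices whose signs coincide in every row
    or are opposite in every row (fully correlated columns)."""
    n_cols = len(matrix[0])
    pairs = []
    for a, b in combinations(range(n_cols), 2):
        # The element-wise product is constant (+1 or -1) only when the
        # two columns are identical or exact mirror images of each other.
        products = {row[a] * row[b] for row in matrix}
        if len(products) == 1:
            pairs.append((a, b))
    return pairs


# Hypothetical 4-point matrix for three factors (not taken from the text).
good = [
    [+1, -1, -1],
    [-1, -1, +1],
    [-1, +1, -1],
    [+1, +1, +1],
]
# Same matrix with a fourth column that mirrors the first one.
bad = [row + [-row[0]] for row in good]

print(correlated_pairs(good))  # no correlated pairs expected
print(correlated_pairs(bad))   # columns 0 and 3 are mirror images
```

A matrix that returns an empty list passes this check; any reported pair means the two factor effects cannot be separated with that matrix.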
Processing the obtained results is done after performing the experiment by the
constructed matrix. As has been said, the outcomes are analyzed by scatter diagrams
for each factor. On the diagram of each factor, the outcomes of all design points of
the experiment are plotted at the upper (+) or lower (-) level of that factor. The
number of scatter diagrams corresponds to the number of factors.

Each factor effect is considered independently from others and is at first determined
visually, based on the differences of the response means in the upper and lower
levels. The median is taken as the response mean at the associated level. Besides,
based on the number of so-called division points in the upper and lower levels of the
scatter diagram, one may judge the significance of factor effects on response. The
existence of a large enough number of division points is a strong reason to separate
the significant factor, even in cases when the difference between medians is small [15].
Visually selected effects are estimated quantitatively by means of auxiliary tables with
several inputs. In principle, those are the tables with two or three inputs, so that all
the necessary sign combinations usually are within the random balance matrix.
Construction of auxiliary tables and the calculation of associated effects are identical
to the analysis of variance. By dividing the obtained effects by two, the associated
regression coefficients (b_i) are estimated. The significance of obtained effects
is checked by Student's t-criterion for the corresponding threshold or significance
level. Having selected significant factors we correct the initial design point
results so as to annul the chosen effects. The effect value with a changed sign is
added to all response values that correspond to the (+) level of the chosen effect.
This, in fact, annuls the median differences for the levels of the observed factor.
Following this correction, the procedure is repeated by including factor interactions.

The moment to stop screening factors depends on the Fisher criterion:

F = S_R^2 / S_y^2    (2.33)

where:

• S_R^2 is the variance of the response values of all design points;

• S_y^2 is the variance of system reproducibility as determined by outcomes of
replicated experiment design points (trials).

Variance S_R^2 is determined after each result correction, and then the associated
value of Fisher's criterion is calculated and compared to the tabular value for the
chosen significance level. Selection of factors is stopped at the moment when the
variance of the corrected response values is no longer significantly greater than the
reproducibility variance of the system. The following examples will demonstrate
construction of a design matrix for the method of random balance by random mixing of
rows of regular fractional replicas or of a full factorial experiment. The procedure
consists of breaking all factors into groups according to the physics of the research
subject or to the results of prior ranking of the factors. Groups should thereby
consist of not more than five to six factors. A design matrix of a full factorial
experiment 1) or fractional factorial experiment 2), with its rows in a random order,
corresponds to each group. A better solution is grouping by three to four factors,
with a FUFE matrix corresponding to each group.

1) Full factorial experiment FUFE

2) Fractional factorial experiment FRFE

Example 2.10 [17]

Filtration conditions were studied in a process of producing dyes. The filtration
phenomenon is expressed with these four factors:

X1 concentration of a solution being filtered;
    (+) upper level, concentrated solution;
    (-) lower level, diluted solution;

X2 time of dosing solution;
    (+) upper level, fresh solution;
    (-) lower level, old solution;

X3 contents of filler in solution;
    (+) upper level, filler present;
    (-) lower level, filler not present;

X4 temperature of solution;
    (+) upper level, higher temperature;
    (-) lower level, lower temperature.

Following the procedure of the matrix construction, the factors are divided into
two groups of two factors each. The first group consists of X1 and X2, and the other
one of X3 and X4 factors. A FUFE matrix for two factors has four design points and
is shown in Table 2.18.


Table 2.18 FUFE

D.P. X1  X2
1    -   -
2    +   -
3    -   +
4    +   +

Table 2.19 Matrix of random balance

D.P. X1  X2  X3  X4  y
1    +   -   -   +   114
2    -   -   +   +   106
3    -   +   +   -   120
4    +   +   -   -   132

By random choice from the FUFE matrix in Table 2.18, we take for the first group of
factors the 2nd, 1st, 3rd and 4th rows, respectively. For the second group of factors
we take in the same way the 3rd, 4th, 2nd and 1st rows of the FUFE matrix. In this
way, the design matrix for the random balance method, Table 2.19, is constructed. In
this example, the response is represented by the product purity y.
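
The row-mixing construction described above can be sketched as follows. The function names `draw_row_order` and `random_balance_matrix` are hypothetical, levels are coded as +1 and -1, and the fixed row orders are the ones quoted in the example; the seeded random draw at the end only illustrates how a new order could be chosen.

```python
import random

# Rows of the two-factor full factorial design (Table 2.18), keyed by row number.
FUFE = {1: (-1, -1), 2: (+1, -1), 3: (-1, +1), 4: (+1, +1)}


def draw_row_order(rng):
    """Draw a random order of the FUFE rows for one group of factors."""
    return rng.sample(sorted(FUFE), len(FUFE))


def random_balance_matrix(row_orders):
    """Join the FUFE rows of each group side by side, one design point per line."""
    matrix = []
    for picks in zip(*row_orders):
        point = []
        for row_number in picks:
            point.extend(FUFE[row_number])
        matrix.append(point)
    return matrix


# Row orders quoted in the example: group (X1, X2), then group (X3, X4).
design = random_balance_matrix([(2, 1, 3, 4), (3, 4, 2, 1)])
for point in design:
    print(point)

# A fresh random draw would look like this (seed chosen arbitrarily).
rng = random.Random(0)
other = random_balance_matrix([draw_row_order(rng), draw_row_order(rng)])
```

With the quoted row orders, the printed rows reproduce the factor columns of the random balance matrix of this example.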

The values for product purity shown in Table 2.19 were obtained after the
experiment. Scatter diagrams are drawn from the response values for each factor. The
scatter diagram for all four factors is shown in Fig. 2.20.

Figure 2.20 Scatter diagram

The median is determined on this diagram for each factor level. When the number
of points in one level is odd, 2n+1, the median passes through the point n+1.
If the number of points is even, 2n, the median passes through the point that is the
arithmetic average between points n and n+1. The difference between the upper
and lower level medians gives the effect of the associated factor:

E_{X_1} = 123 - 113 = +10

E_{X_2} = 126 - 110 = +16    (2.34)

E_{X_3} = 120 - 114 = +6

E_{X_4} = 114 - 120 = -6

Judging by the calculated effects, the significant factors after the first screening
are X1 and X2.
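
As a minimal sketch of the median rule, the effects of the two selected factors can be recomputed from the X1 and X2 columns and the responses of Table 2.19. The function name `median_effect` is hypothetical; `statistics.median` already averages the two middle points when a level holds an even number of points.

```python
from statistics import median


def median_effect(levels, responses):
    """Difference between the response medians at the (+) and (-) levels."""
    upper = [y for sign, y in zip(levels, responses) if sign > 0]
    lower = [y for sign, y in zip(levels, responses) if sign < 0]
    return median(upper) - median(lower)


y = [114, 106, 120, 132]   # responses of design points 1 to 4
x1 = [+1, -1, -1, +1]      # X1 column of the random balance matrix
x2 = [-1, -1, +1, +1]      # X2 column of the random balance matrix

effect_x1 = median_effect(x1, y)
effect_x2 = median_effect(x2, y)
print(effect_x1, effect_x2)
```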

These  factors  are  then  quantitatively  checked  by  forming  an  auxiliary  table  with 
two  inputs: 

Table 2.20 Table with two inputs

       +X1                 -X1
+X2    132                 120
       \bar{y}_1 = 132     \bar{y}_2 = 120
-X2    114                 106
       \bar{y}_3 = 114     \bar{y}_4 = 106

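The effects of the selected factors are obtained from Table 2.20:

E_{X_1} = (+X_1) - (-X_1) = (\bar{y}_1 + \bar{y}_3)/2 - (\bar{y}_2 + \bar{y}_4)/2 = +10

E_{X_2} = (+X_2) - (-X_2) = (\bar{y}_1 + \bar{y}_2)/2 - (\bar{y}_3 + \bar{y}_4)/2 = +16    (2.35)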
To find out whether other factors are significant too, it is necessary to exclude
the effects of the selected factors and then repeat the procedure. The effects of the
selected factors are excluded in such a way that in each design point where X1 is in
the upper level (+), the effect of factor X1 is added to the response value with a
changed sign (-E_{X_1} = -10), and where X2 is in the upper level (+), the effect of
factor X2 is added to the response value with a changed sign (-E_{X_2} = -16). In this
way, Table 2.21 is obtained with corrected responses.

Table 2.21 Design matrix with corrected responses

D.P. X1  X2  X3  X4  y    y'
1    +   -   -   +   114  104
2    -   -   +   +   106  106
3    -   +   +   -   120  104
4    +   +   -   -   132  106
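
The response correction can be sketched as below, assuming the design matrix and responses of this example and the effects +10 for X1 and +16 for X2. The function name `remove_effects` and the column-index convention (0 for X1, 1 for X2) are illustrative choices, not part of the original text.

```python
def remove_effects(design, responses, effects):
    """Subtract each selected effect from the responses of the design
    points in which the corresponding factor is at its (+) level."""
    corrected = []
    for row, y in zip(design, responses):
        for column, effect in effects.items():
            if row[column] > 0:
                y -= effect
        corrected.append(y)
    return corrected


design = [
    (+1, -1, -1, +1),
    (-1, -1, +1, +1),
    (-1, +1, +1, -1),
    (+1, +1, -1, -1),
]
y = [114, 106, 120, 132]

# Column 0 is X1 (effect +10), column 1 is X2 (effect +16).
y_corrected = remove_effects(design, y, {0: 10, 1: 16})
print(y_corrected)
```

The printed values agree with the corrected responses y' of the table above.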

If  we  now  reconstruct  the  scatter  diagram  we  will  see  that  effects  for  factors  X3 
and  X4  are  irrelevant  and  close  to  zero.  The  conclusion  is  that  factors  X3  and  X4  are 
insignificant.  Besides  selecting  factors  by  their  significance,  it  is  necessary  to  single 
out  the  interaction  effects.  Signs  for  associated  interactions  are  obtained  by  simply 
multiplying  the  signs  of  associated  factors,  as  shown  in  Table  2.22. 

Table 2.22 Design matrix with interactions

Number     Factors           Interactions                       Response
of trials  X1  X2  X3  X4    X1X2 X1X3 X1X4 X2X4 X2X3 X3X4      y
1          +   -   -   +     -    -    +    -    +    -         114
2          -   -   +   +     +    -    -    -    -    +         106
3          -   +   +   -     -    -    +    -    +    -         120
4          +   +   -   -     +    -    -    -    -    +         132

Only even (two-factor) interactions are analyzed, since higher-order interactions are
less probable. Drawing scatter diagrams for all even interactions is a troublesome job.
Scatter diagrams are therefore constructed only for those even interactions for which
there is an indication that they may be crucial. Such an indication is that higher
effects are obtained from interactions of factors whose point dispersion in the upper
and lower levels is reversed. The scatter diagram of even interactions is obtained in
the same way as for the factors and is shown in Fig. 2.21.

Figure 2.21 Scatter diagram (response at the (-) and (+) levels of X1X2, X1X3, X1X4,
X2X4, X2X3 and X3X4)

Interaction effects X1X3, X2X3 and X2X4 are singled out from the scatter diagram.
An auxiliary table, Table 2.23, with three inputs is formed for the given interactions.
Vacancies in this table indicate that interaction columns X1X3 and X2X4 are
intercorrelated. This is also proved by the equality of signs, which does not allow
actual calculation of the effects. Table 2.24 with two inputs is therefore formed with
interaction columns X1X3 and X2X4 left vacant. With a larger number of design points
and more factors, column intercorrelation is hardly probable. Finally, we may conclude
that by the method of random balance these significant factors and interactions
were singled out:

• factor X1 with effect E_{X_1} = +10;

• factor X2 with effect E_{X_2} = +16;

• even interaction X2X4;

• even interaction X1X3.

Table  2.23  Table  with  three  inputs 




+X,x3 

-X1X3 

+X2X4  -X2X4 

+X2X4  -X2X4 

+X,X,  

114 

120 

132 

Vi  = 132 

y4  = H7 

-x,x3  

106 

Ys  = 106 
Table 2.24 Table with two inputs

        +X2X4               -X2X4
+X1X3   132
        \bar{y}_1 = 132
-X1X3                       114, 106, 120
                            \bar{y}_4 = 113

Interaction effects X1X3 and X2X4 could not be singled out due to column
correlation, so they remained mixed.

This example, too, has shown the advantage of constructing the design matrix for
the method of random balance from FUFE and FRFE, since for the two selected
factors X1 and X2 we already have a finished experiment by a full factorial design.
These results are for all four design points of the design in Table 2.19. Since the
method of random balance also gives the effects, they may be divided by two to
obtain regression coefficients for a rough approximation of the response:

\hat{y} = \bar{y} + (10/2) X_1 + (16/2) X_2 = (114 + 106 + 120 + 132)/4 + 5 X_1 + 8 X_2

\hat{y} = 118 + 5 X_1 + 8 X_2 ;  (-1 \le X_1 \le 1 ; -1 \le X_2 \le 1)    (2.36)
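
A short sketch of how the approximation (2.36) follows from the effects: the intercept is the mean response and each coefficient is half of the corresponding effect. The names `mean_y`, `coefficients` and `predict` are illustrative only.

```python
y = [114, 106, 120, 132]             # responses of the four design points
effects = {"X1": 10, "X2": 16}       # effects of the selected factors

mean_y = sum(y) / len(y)             # intercept of the model
coefficients = {name: e / 2 for name, e in effects.items()}


def predict(x1, x2):
    """Approximate response for coded factor values between -1 and +1."""
    return mean_y + coefficients["X1"] * x1 + coefficients["X2"] * x2


# Predictions at the four corners of the coded factor space.
for x1, x2 in [(+1, -1), (-1, -1), (-1, +1), (+1, +1)]:
    print(x1, x2, predict(x1, x2))
```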

After everything that has been presented, the following can be said about the active
method of screening factors, i.e. the method of random balance:

• the method of random balance is effective in complicated, unclear situations
when fast screening of factors is necessary; the method is the more effective
the fewer significant factors exist, since an increase in their number
increases the design-point variance;

• the method of random balance is more of a new approach to an experiment
than a way of processing experimental data;

• even untrained researchers may profit from the method, even in situations
with a large number of factors;

• the method facilitates singling out significant factors with a sufficient
confidence level;

• the method of random balance is primarily meant for researchers in labs and
pilot plants, since in full-scale plants it requires disrupting the current
process conditions.

Advantages  and  disadvantages  of  the  method  are  shown  in  Table  2.25: 
Table 2.25 Advantages and disadvantages of random balance method

No.  Advantages | Disadvantages
1    Possibility to include many factors, decrease in risk to omit a factor | Unclear determination of risk and reduction of chance to omit a factor
2    Assumes exponential ranking of factors, which seems natural | Difficult to expect exponential distribution of factors
3    Random design, simple designing is used, experiments are less strict | System designs, standard designs, orthogonality are all better
4    Few design points necessary, designs oversaturated | Not enough information, effect estimates mixed
5    Error slightly bigger but the problem is screening most significant factors | Less significant factors are hard to detect
6    Simple graphic analysis | It is present with other methods
7    Possibility to vary factors on several levels | This complicates interaction
8    Numerous examples of application | No theoretical ground, dangerous to recommend it
9    Simple model | More complex model needed

For practical application, several prepared design matrices by the method of
random balance [18]-[20] are recommended.

Table 2.26 k=8

D.P. X1  X2  X3  X4  X5  X6  X7  X8
1    -   -   +   +   -   +   -   +
2    +   +   +   +   +   -   -   +
3    -   +   +   -   +   -   +   -
4    +   -   -   +   +   +   +   +
5    -   +   -   +   -   -   -   -
6    +   +   -   -   -   -   +   +
7    -   -   -   -   -   +   +   -
8    +   -   +   -   +   +   -   -
Table 2.27 k=10

D.P. X1  X2  X3  X4  X5  X6  X7  X8  X9  X10
1    -   -   -   -   -   -   -   -   +   +
2    +   +   -   -   -   +   +   -   -   -
3    +   -   +   +   +   +   -   +   -   -
4    -   +   +   +   +   -   -   -   -   -
5    +   -   +   -   -   +   -   -   +   -
6    -   +   +   -   -   +   +   +   -   +
7    -   -   -   +   +   +   -   -   -   +
8    +   +   -   +   +   +   -   +   +   +
9    +   -   -   -   +   -   +   +   -   -
10   -   +   -   -   +   -   +   -   +   -
11   -   -   +   +   -   -   +   +   +   +
12   +   +   +   +   -   -   -   +   -   -
13   +   -   -   +   -   +   +   -   +   +
14   -   +   -   +   -   -   -   +   +   -
15   -   -   +   -   +   -   +   -   -   -
16   +   +   +   -   +   +   +   +   +   -

Table 2.28 k=5

D.P. X1  X2  X3  X4  X5
1    +   +   +   -   -
2    +   +   -   -   +
3    +   -   +   +   -
4    +   -   -   +   +
5    -   +   +   +   +
6    -   +   -   +   -
7    -   -   +   -   +
8    -   -   -   -   -
Table 2.29 k=12

D.P. X1  X2  X3  X4  X5  X6  X7  X8  X9  X10 X11 X12
1    -   +   -   +   +   +   +   -   +   +   +   +
2    -   -   +   -   -   -   -   -   +   +   +   -
3    -   +   +   -   +   -   -   -   -   -   +   +
4    +   -   -   -   +   +   -   -   -   +   -   -
5    +   -   -   +   -   -   -   +   -   +   +   +
6    -   +   -   -   +   -   -   -   +   -   -   -
7    +   +   -   -   -   +   +   -   +   -   -   +
8    -   -   +   +   +   -   -   +   +   +   -   -
9    -   -   +   +   -   -   +   +   +   -   +   +
10   -   +   +   -   -   -   +   +   +   +   +   -
11   +   +   -   +   +   +   -   -   -   +   -   +
12   -   -   -   -   -   +   +   +   -   +   +   -
13   +   +   +   +   +   +   -   -   -   +   -   +
14   -   -   -   +   -   +   +   -   -   -   -   -
15   -   -   -   +   -   +   -   +   +   -   -   +
16   +   -   -   -   +   -   -   -   +   +   +   -
17   +   +   +   +   -   +   -   +   +   +   -   +
18   +   -   +   -   +   -   +   +   -   -   -   -
19   +   -   +   -   -   -   +   +   -   -   +   +
20   -   +   -   +   -   +   +   +   +   +   -   -
21   -   +   +   -   +   +   +   -   -   -   +   +
22   +   -   +   +   +   -   -   -   +   -   +   -
23   +   +   +   +   -   +   +   +   -   -   -   +
24   +   +   -   -   +   -   +   +   -   -   +   -
Table 2.30 k=14

D.P. X1  X2  X3  X4  X5  X6  X7  X8  X9  X10 X11 X12 X13 X14
1    -   -   +   +   +   -   +   -   +   -   +   +   -   +
2    +   -   -   +   -   -   -   -   -   +   -   -   -   +
3    -   +   -   +   +   -   -   +   +   +   +   -   +   -
4    +   +   +   +   +   +   -   -   +   +   -   +   -   +
5    -   -   -   -   -   +   +   +   -   -   +   -   -   -
6    +   -   +   -   +   +   +   -   +   +   +   -   +   +
7    -   +   +   -   -   +   +   -   +   +   -   +   +   -
8    +   +   -   -   -   -   -   +   -   -   +   -   -   -
9    -   -   +   -   -   +   -   +   -   +   +   +   +   +
10   +   -   -   -   -   +   -   -   -   -   -   -   +   +
11   -   +   -   -   +   -   +   -   +   -   -   -   -   -
12   +   +   +   -   +   -   -   -   -   -   -   -   +   -
13   -   -   -   +   +   +   -   +   -   +   +   +   +   -
14   +   -   +   +   -   -   +   +   +   -   +   +   -   -
15   -   +   +   +   -   -   +   +   +   +   -   +   +   +
16   +   +   -   +   +   +   +   +   -   -   -   +   -   +

Table 2.31 k=19

D.P. X1  X2  X3  X4  X5  X6  X7  X8  X9  X10 X11 X12 X13 X14 X15 X16 X17 X18 X19
1    -   -   +   +   +   -   +   +   +   -   +   -   -   -   -   -   -   -   -
2    +   -   -   +   +   -   +   +   -   -   +   +   -   +   +   -   +   +   -
3    -   +   -   +   +   +   -   +   +   +   +   -   +   -   -   -   +   +   +
4    +   +   +   +   +   -   +   -   +   -   +   +   +   -   +   +   -   +   -
5    -   -   -   -   +   +   -   +   +   +   -   -   -   +   +   +   +   -   +
6    +   -   +   -   +   -   +   -   -   -   +   -   -   +   +   -   -   +   +
7    -   +   +   -   +   -   +   +   -   -   -   +   +   -   -   -   +   +   -
8    +   +   -   -   +   +   -   -   -   -   +   -   +   -   +   +   +   -   -
9    -   -   +   -   -   -   -   +   +   +   -   -   +   -   +   +   +   -   +
10   +   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   +
11   -   +   -   -   -   +   +   +   +   +   +   +   -   +   +   -   -   +   -
12   +   +   +   -   -   -   -   +   -   +   +   -   +   +   -   +   -   -   -
13   -   -   -   +   -   +   +   -   +   -   -   +   +   +   -   +   -   +   +
14   +   -   +   +   -   -   -   -   +   +   +   -   -   -   -   +   +   -   -
15   -   +   +   +   -   +   +   +   -   +   +   +   -   -   +   +   -   +   +
16   +   +   -   +   -   +   +   +   +   +   -   +   -   -   +   -   +   -   +
17   -   -   +   +   -   +   -   +   -   -   +   -   +   +   -   -   +   -   -
18   +   -   -   +   -   +   -   -   -   +   +   +   +   +   +   -   -   -   -
19   -   +   -   +   -   +   -   -   -   -   +   +   -   -   +   -   +   +   +
20   +   +   +   +   -   -   +   -   +   -   -   +   -   -   +   +   -   -   -
21   -   -   -   -   -   -   +   -   -   -   -   -   +   -   -   +   +   +   -
22   +   -   +   -   -   +   -   +   -   -   -   -   -   +   -   +   -   +   +
23   -   +   +   -   -   +   -   -   -   +   -   -   -   -   +   -   +   +   +
24   +   +   -   -   -   +   +   +   -   +   -   +   +   +   +   -   -   +   -
25   -   -   +   -   +   -   +   +   -   +   +   -   -   +   -   +   +   -   -
26   +   -   -   -   +   -   +   +   +   -   +   +   +   +   -   -   +   -   +
27   -   +   -   -   +   +   -   -   +   +   -   +   -   +   -   +   +   +   +
28   +   +   +   -   +   -   +   -   -   -   -   -   +   +   -   -   -   +   +
29   -   -   -   +   +   +   -   -   +   +   +   +   +   -   -   -   -   -   +
30   +   -   +   +   +   -   -   +   +   +   -   -   +   +   -   +   -   -   +
31   -   +   +   +   +   -   -   -   +   -   -   +   -   +   +   +   -   +   -
32   +   +   -   +   +   +   +   -   +   +   -   +   +   -   +   +   +   -   +

It should be noted that when doing an experiment by the method of random
balance, it is not necessary to perform the design points in a random order, since
the randomization principle has already been introduced in constructing the matrix.

Example 2.11

Adhesion on “HLORIN”-type fibers has been studied as a function of five process
factors. The names of the factors, with their variation levels, are shown in Table 2.32.
The matrix of a 2^3 full factorial experiment has been used in constructing the random
balance matrix. The design matrix by the method of random balance with experimental
results is shown in Table 2.33. Note that each design point was repeated 20 to 50
times due to the high nonreproducibility of the system.

Table 2.32 Variation levels

Factors                           Levels
                                  -          +
X1 pressing temperature, °C       140        170
X2 pressing pressure, kp/cm^2     5          20
X3 time of pressing, min          0.5        2.5
X4 time of preprocessing, min     1          3
X5 fiber nature                   Viscose    Capronate
Table 2.33 Design matrix

D.P. X1  X2  X3  X4  X5  y
1    +   +   +   -   -   32.6
2    +   +   -   -   +   15.2
3    +   -   +   +   -   18.9
4    +   -   -   +   +   15.2
5    -   +   +   +   +   14.6
6    -   +   -   +   -   33.1
7    -   -   +   -   +   14.0
8    -   -   -   -   -   24.5

A scatter diagram was constructed based on the results from Table 2.33. The effects
of factors X2 and X5 were visually singled out from the scatter diagram shown in
Fig. 2.22.

Figure 2.22 Scatter diagram (response y, axis from 0 to 50, at the (-) and (+) levels
of X1 to X5)

Table 2.34, with two inputs, is formed for the two selected factors. The effects of
factors X2 and X5 are quantitatively determined from the table. Then, a significance
check of the obtained effects is done by Student's t-criterion.


Table 2.34 Table with two inputs

       +X2                        -X2
+X5    15.2, 14.6                 15.2, 14.0
       \sum Y_1 = 29.8            \sum Y_2 = 29.2
       \bar{y}_1 = 14.9           \bar{y}_2 = 14.6
-X5    33.1, 32.6                 24.5, 18.9
       \sum Y_3 = 65.7            \sum Y_4 = 43.4
       \bar{y}_3 = 32.9           \bar{y}_4 = 21.7
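E_{X_5} = (\bar{y}_1 + \bar{y}_2)/2 - (\bar{y}_3 + \bar{y}_4)/2 = (14.9 + 14.6)/2 - (32.9 + 21.7)/2 = -12.52

E_{X_2} = (\bar{y}_1 + \bar{y}_3)/2 - (\bar{y}_2 + \bar{y}_4)/2 = (14.9 + 32.9)/2 - (14.6 + 21.7)/2 = +5.72

Significance check of the effects:

t_{X_5} = [(\bar{y}_1 + \bar{y}_2) - (\bar{y}_3 + \bar{y}_4)] / \sqrt{\sum s_R^2/n_i} = (14.9 + 14.6 - 32.9 - 21.7) / \sqrt{8.52} = -8.57

t_{X_2} = [(\bar{y}_1 + \bar{y}_3) - (\bar{y}_2 + \bar{y}_4)] / \sqrt{\sum s_R^2/n_i} = (14.9 + 32.9 - 14.6 - 21.7) / \sqrt{8.52} = 3.92

where:

\bar{y}_1 ; \bar{y}_2 ; \bar{y}_3 ; \bar{y}_4 are response means by cells of the auxiliary table;
s_R^2 is the response variance by cells;
n_i is the number of design points in each cell.

The value \sum s_R^2/n_i is obtained by calculations shown in Table 2.35.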

Table 2.35 Calculation of variance s_R^2

s_R^2 = \sum Y_i^2/(n_i - 1) - (\sum Y_i)^2/(n_i (n_i - 1))

Cell   n_i   \sum Y_i   (\sum Y_i)^2   \sum Y_i^2   s_R^2    s_R^2/n_i
1      2     29.8       888.04         444.2        0.16     0.08
2      2     29.2       852.64         427.4        1.08     0.54
3      2     65.7       4316.49        2158.4       0.12     0.06
4      2     43.4       1883.56        957.5        15.68    7.84
Sum    8                                                     8.52

The number of degrees of freedom is the difference between the total number of
experimental design points and the number of cells; in this case it is:

f = 8 - 4 = 4.

For the threshold or significance level α=0.05 and for f=4 we have the tabular
value of Student's criterion t_T=2.78. Since the calculated values are above the
tabular value for the t-test, the selected effects are statistically significant with 95%
confidence. The next step was the correction of response values by -5.72 and +12.52 in
those design points where X2 and X5 are in the upper levels. After the response
correction, a new scatter diagram was constructed for the factors and even
interactions. X3 and X2X5 were visually separated with their effects -5.42 and -1.98,
respectively. The significance check of these effects showed that X2X5 had 95% and
X3 90% confidence levels. After the correction of the corrected responses with the
effects of X3 and X2X5, the variance of the twice-corrected responses, S_R^2=1.27,
approached the reproducibility variance S_y^2=0.50. A check of Fisher's criterion gave
the value F_R=1.27/0.50=2.54. Since the tabular value is F_{0.05}=2.36, the further
process of screening factors is stopped.
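
The effect estimates and the t-check for X2 and X5 can be sketched from the raw cell data of the two-input table. This is a hedged illustration: the dictionary layout and variable names are assumptions, and because no intermediate value is rounded here, the t-values may differ slightly from those quoted in the text.

```python
from math import sqrt
from statistics import mean, variance

# Responses grouped by the levels of X2 and X5 (cells of the two-input table).
cells = {
    ("+X2", "+X5"): [15.2, 14.6],
    ("-X2", "+X5"): [15.2, 14.0],
    ("+X2", "-X5"): [33.1, 32.6],
    ("-X2", "-X5"): [24.5, 18.9],
}
m = {key: mean(values) for key, values in cells.items()}

# Effect = mean at the (+) level minus mean at the (-) level of the factor.
effect_x5 = (m["+X2", "+X5"] + m["-X2", "+X5"]) / 2 - (m["+X2", "-X5"] + m["-X2", "-X5"]) / 2
effect_x2 = (m["+X2", "+X5"] + m["+X2", "-X5"]) / 2 - (m["-X2", "+X5"] + m["-X2", "-X5"]) / 2

# Denominator of the t-criterion: square root of the sum of s^2 / n over cells.
std_error = sqrt(sum(variance(values) / len(values) for values in cells.values()))
t_x5 = 2 * effect_x5 / std_error
t_x2 = 2 * effect_x2 / std_error

degrees_of_freedom = sum(len(v) for v in cells.values()) - len(cells)
t_table = 2.78  # tabular value quoted in the text for alpha = 0.05, f = 4

print(round(effect_x5, 2), round(effect_x2, 2), degrees_of_freedom)
print(abs(t_x5) > t_table, abs(t_x2) > t_table)
```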

Table 2.36 Summarized results

Screening phase              Visual screening   Effect values   t      b_i
Initial response values      X5                 -12.52          8.57   -6.26
                             X2                 +5.72           3.92   +2.86
First response correction    X3                 -5.42           6.69   -2.71
                             X2X5               -1.98           2.22   -0.99
Second response correction   X4                 -1.97           1.54   -0.54
                             X1                 -1.13           1.62   -0.57

Hence, the adhesion of fibers is significantly affected by factors X5, X2 and X3.
The expected effect of the X3 factor proved to be marginal. This may be explained only
by the fact that either its value in the center of the performed experiment is at the
maximum or its variation interval has been badly chosen. Further experiments have
proved the first assumption. The summarized result of the method of random balance is
shown in Table 2.36.

Example  2.12  [18] 

Factors were not screened out in Example 2.8 by the method of prior ranking, so a random balance matrix was constructed for all eight factors, Table 2.37. Each design point was run twice (one replication) in order to estimate the reproducibility variance of the system.

Table 2.37 Design of experiment

                      Factors                                          Responses
                      X1    X2    X3    X4     X5    X6    X7    X8    y01       y02       ȳ
Basic level           60    50    60    1200   15    15    6     310
Variation interval    10    30    40    400    5     5     4     305
Upper level (+)       70    80    100   1600   20    20    10    615
Lower level (-)       50    20    20    800    10    10    2     5

Number of design points
1                     -     -     +     +      -     +     -     +     528.00    540.80    534.40
2                     +     +     +     +      +     -     -     +     345.60    345.30    345.45
3                     -     +     +     -      +     -     +     -     512.00    521.60    516.80
4                     +     -     -     +      +     +     +     +     713.60    576.00    644.80
5                     -     +     -     +      -     -     -     -     1067.20   912.00    989.60
6                     +     +     -     -      -     -     +     +     870.40    832.00    851.20
7                     -     -     -     -      -     +     +     -     1648.00   1657.60   1652.80
8                     +     -     +     -      +     +     -     -     572.80    454.40    513.60

Note: S_y^2 = 3678.89; f = 8(2 - 1) = 8
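The reproducibility variance quoted in the note to Table 2.37 can be checked from the two replicate responses of each design point. The following Python sketch is an illustration added here, not part of the original example; the function name `reproducibility_variance` is hypothetical. It pools the within-point variances of the eight design points, each of which contributes 2 - 1 = 1 degree of freedom.

```python
# Replicate responses (y01, y02) of the eight design points in Table 2.37.
replicates = [
    (528.00, 540.80),
    (345.60, 345.30),
    (512.00, 521.60),
    (713.60, 576.00),
    (1067.20, 912.00),
    (870.40, 832.00),
    (1648.00, 1657.60),
    (572.80, 454.40),
]


def reproducibility_variance(points):
    """Return the pooled within-point variance and its degrees of freedom."""
    sum_of_squares = 0.0
    dof = 0
    for runs in points:
        mean = sum(runs) / len(runs)
        sum_of_squares += sum((y - mean) ** 2 for y in runs)
        dof += len(runs) - 1
    return sum_of_squares / dof, dof


s2_y, f = reproducibility_variance(replicates)
print(f"S_y^2 = {s2_y:.2f}, f = {f}")
```

Under these assumptions the result agrees with the values S_y^2 = 3678.89 and f = 8 given in the note.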


The  scatter  diagram  was  constructed  from  the  obtained  response  values,  Fig.  2.23. 


Figure 2.23 Scatter diagram (horizontal axis: lower (-) and upper (+) level of each factor X1 to X8)

Based on the scatter diagram, factors X3 and X5 are visually screened out. A two-way table is formed for these factors. Quantitative effects are calculated from this auxiliary table and then checked by the t-test. Since both effects are significant, the first correction is made and the procedure is repeated. Factors X4 and X8 are visually screened out in the second step. Quantitative calculation of the effects and the t-test show that only factor X8 is significant. The effect of this factor is annulled by the second response correction. Interactions are now introduced into the scatter diagrams, so that X1X7 and X5X8 are visually screened out. The quantitative values of the effects of these two interactions were not statistically significant. Since Table 2.38 shows that the variance of the corrected responses no longer differs significantly from the reproducibility variance, the selection procedure is over. The diagram in Fig. 2.24, which clearly shows the reduction in residual variance (S_R^2), is especially important for the method of random balance.
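The response correction used in this procedure can be sketched in a few lines of Python. This is an illustration only: the effect values -408.16 (X3) and -297.76 (X5) are not quoted in the text; they are inferred here from the differences between the y and y' columns of Table 2.38, and the function name `correct_responses` is hypothetical. A design point is corrected by subtracting the effect of every screened factor that is at its upper level in that point.

```python
# Levels of X3 and X5 (+1 upper, -1 lower) for design points 1 to 8 (Table 2.37).
x3 = [+1, +1, +1, -1, -1, -1, -1, +1]
x5 = [-1, +1, +1, +1, -1, -1, -1, +1]

# Mean responses of the eight design points.
y = [534.40, 345.45, 516.80, 644.80, 989.60, 851.20, 1652.80, 513.60]

# Effects inferred from Table 2.38 (assumption, see the note above).
EFFECT_X3 = -408.16
EFFECT_X5 = -297.76


def correct_responses(responses, screened):
    """Subtract each screened effect where its factor is at the upper level."""
    corrected = list(responses)
    for levels, effect in screened:
        for i, level in enumerate(levels):
            if level > 0:
                corrected[i] -= effect
    return corrected


y_first = correct_responses(y, [(x3, EFFECT_X3), (x5, EFFECT_X5)])
print([round(v, 2) for v in y_first])
```

With these inferred effects the corrected values coincide with the y' column of Table 2.38; points where both factors are at the lower level are left unchanged.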


Figure 2.24 Corrected responses (horizontal axis: y, y', y'', y''')

Table 2.38 Corrected responses

No.     Factors                                   Corrected responses
D.P.    X1   X2   X3   X4   X5   X6   X7   X8     y            y'          y''         y'''
1       -    -    +    +    -    +    -    +      534.40       942.56      1204.85     1428.83
2       +    +    +    +    +    -    -    +      345.45       1051.37     1313.66     1313.66
3       -    +    +    -    +    -    +    -      516.80       1222.72     1222.72     1222.72
4       +    -    -    +    +    +    +    +      644.80       942.56      1204.85     1428.83
5       -    +    -    +    -    -    -    -      989.60       989.60      989.60      1213.58
6       +    +    -    -    -    -    +    +      851.20       851.20      1113.49     1337.47
7       -    -    -    -    -    +    +    -      1652.80      1652.80     1652.80     1652.80
8       +    -    +    -    +    +    -    -      513.60       1219.52     1219.52     1219.52

ȳ                                                 756.08       1109.04     1240.19     1352.18
S_R^2                                             173411.41    65758.33    36817.62    22484.50
F = S_R^2/S_y^2                                   47.14        17.87       10.01       6.11

Note: S_y^2 = 3678.89 (reproducibility variance); f = 8; F_T(7; 8; 0.99) = 6.18
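The stopping rule behind Table 2.38 can be reproduced numerically. The sketch below is an added illustration: it takes the four response columns of the table, computes their sample variances with 8 - 1 = 7 degrees of freedom, and compares the ratio to the reproducibility variance with the tabular value quoted in the note. The variable names are hypothetical.

```python
from statistics import variance

S2_Y = 3678.89   # reproducibility variance (note to Table 2.38)
F_TABLE = 6.18   # tabular value F_T(7; 8; 0.99) from the same note

# Response columns of Table 2.38 after 0, 1, 2 and 3 corrections.
stages = {
    "y": [534.40, 345.45, 516.80, 644.80, 989.60, 851.20, 1652.80, 513.60],
    "y'": [942.56, 1051.37, 1222.72, 942.56, 989.60, 851.20, 1652.80, 1219.52],
    "y''": [1204.85, 1313.66, 1222.72, 1204.85, 989.60, 1113.49, 1652.80, 1219.52],
    "y'''": [1428.83, 1313.66, 1222.72, 1428.83, 1213.58, 1337.47, 1652.80, 1219.52],
}

# statistics.variance divides by n - 1, i.e. by 7 degrees of freedom here.
ratios = {name: variance(values) / S2_Y for name, values in stages.items()}

for name, ratio in ratios.items():
    decision = "stop screening" if ratio < F_TABLE else "continue screening"
    print(f"{name:5s} F = {ratio:6.2f} -> {decision}")
```

Only the last column falls below the tabular value, which is the point at which the example ends the selection procedure.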

Example  2.13  [15] 

Because the method of random balance has the advantages and disadvantages mentioned in this section, its efficiency is demonstrated in this example on an artificially constructed problem in which the effects to be screened are known in advance. It will also be shown that, generally speaking, the method of random balance with more than two levels of factor variation has no advantage. A demand for more than two levels is justified only in cases with qualitative factors.

Assume that twelve factors, X1 to X12, should be screened. The random balance matrix will consist of two independent semi-replicas of a 2^6 full factorial experiment, with rows (design points) that are randomly distributed. The 32 design points thus synthesized start with values taken from a normal population with mean 100 and standard deviation σ0 = 2.0. The effects of the factors have been introduced by adding the following values to these base values wherever the selected factors are at the upper level (+):
value -15 added to factor X7
value -12 added to factor X4
value +10 added to factor X10 and X11
value +8 added to factor X1
value +6 added to factor X5 and X8
value +4 added to factor X2
value -4 added to factor X9

Table 2.39 shows the grouping of factors with the associated semi-replicas of the 2^6 full factorial design.

Table 2.39 Semi-replica of full factorial design 2^6

First group of factors     X1    X2    X3    X4    X5    X6
Second group of factors    X7    X8    X9    X10   X11   X12
1                          +     +     +     +     +     +
2                          +     +     +     +     -     -
3                          +     +     +     -     +     -
4                          +     +     +     -     -     +
5                          +     +     -     +     +     -
6                          +     +     -     +     -     +
7                          +     +     -     -     +     +
8                          +     +     -     -     -     -
9                          +     -     +     +     +     -
10                         +     -     +     +     -     +
11                         +     -     +     -     +     +
12                         +     -     +     -     -     -
13                         +     -     -     +     +     +
14                         +     -     -     +     -     -
15                         +     -     -     -     +     -
16                         +     -     -     -     -     +
17                         -     +     +     +     +     -
18                         -     +     +     +     -     +
19                         -     +     +     -     +     +
20                         -     +     +     -     -     -
21                         -     +     -     +     +     +
22                         -     +     -     +     -     -
23                         -     +     -     -     +     -
24                         -     +     -     -     -     +
25                         -     -     +     +     +     +
26                         -     -     +     +     -     -
27                         -     -     +     -     +     -
28                         -     -     +     -     -     +
29                         -     -     -     +     +     -
30                         -     -     -     +     -     +
31                         -     -     -     -     +     +
32                         -     -     -     -     -     -
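The semi-replica in Table 2.39 follows a regular pattern: the first five columns run through the full 2^5 factorial and, as can be checked row by row, the sixth column equals the product of the first five. The Python sketch below reproduces the table under that observation and then builds a random balance matrix by joining two independently shuffled copies. The function names and the fixed random seed are illustrative assumptions, not taken from the text.

```python
import itertools
import random


def semi_replica():
    """Half replica of the 2^6 design with X6 = X1*X2*X3*X4*X5."""
    rows = []
    for levels in itertools.product((+1, -1), repeat=5):
        x6 = 1
        for level in levels:
            x6 *= level
        rows.append(levels + (x6,))
    return rows


def random_balance_matrix(seed=0):
    """Join two independently shuffled semi-replicas into a 32 x 12 matrix."""
    rng = random.Random(seed)
    first = semi_replica()    # factors X1 to X6
    second = semi_replica()   # factors X7 to X12
    rng.shuffle(first)
    rng.shuffle(second)
    return [a + b for a, b in zip(first, second)]


design = semi_replica()
matrix = random_balance_matrix()
print(len(design), design[0], design[1])
print(len(matrix), len(matrix[0]))
```

Because each half is shuffled separately, every factor still appears 16 times at each level, while the pairing of the two groups of factors is left to chance, which is the idea of the random balance matrix.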

Table 2.40 shows the complete design matrix for the method of random balance: the original data y0 taken from the normal population, the synthesized response y to which the values of the effects were added, and the phases of factor screening with the corrected responses y', y'', y''' and y^IV and their standard deviations.

The scatter diagram, Fig. 2.25, was constructed for the response y. Medians for the upper and lower levels of each factor are marked on the same diagram in order to select the factors visually. The differences between the medians, i.e. the effect values, are also given on the scatter diagram.


Table 2.40 Design of experiment

No.   Combinations    Factors                                           y0    Responses
D.P.  X1-X6  X7-X12   X1  X2  X3  X4  X5  X6  X7  X8  X9  X10 X11 X12         y     y'    y''   y'''  y^IV
1     18     7        -   +   +   +   -   +   +   +   -   -   +   +     101   78    103   103   103   98
2     8      32       +   +   -   -   -   -   -   -   -   -   -   -     99    103   92    101   96    96
3     4      6        +   +   +   -   -   +   +   +   -   +   -   +     100   97    99    99    99    94
4     6      16       +   +   -   +   -   +   +   -   -   -   -   +     100   77    91    100   95    95
5     3      19       +   +   +   -   +   -   -   +   +   -   +   +     99    113   102   102   97    97
6     23     5        -   +   -   -   +   -   +   +   -   +   +   -     96    77    90    99    94    94
7     17     4        -   +   +   +   +   -   +   +   +   -   -   +     97    66    91    100   95    95
8     26     27       -   -   +   +   -   -   -   -   +   -   +   -     99    93    105   105   100   100
9     31     2        -   -   -   -   +   +   +   +   +   +   -   -     99    86    99    99    94    99
10    15     28       +   -   -   -   +   -   -   -   +   -   -   +     99    93    82    91    91    96
11    28     3        -   -   +   -   -   +   +   +   +   -   +   -     102   87    100   100   100   100
12    16     25       +   -   -   -   -   +   -   -   +   +   +   +     100   100   89    97    92    97
13    30     13       -   -   -   +   -   +   +   -   -   +   +   +     103   72    97    106   101   101
14    11     22       +   -   +   -   +   +   -   +   -   +   -   -     106   124   113   113   108   103
15    27     8        -   -   +   -   +   -   +   +   -   -   -   -     98    83    96    105   100   95
16    19     15       -   +   +   -   +   +   +   -   -   -   +   -     99    88    101   101   101   96
17    14     23       +   -   -   +   -   -   -   +   -   -   +   -     100   96    97    97    97    97
18    20     18       -   +   +   -   -   -   -   +   +   +   -   +     100   100   100   100   100   100
19    22     20       -   +   -   +   -   -   -   +   +   -   -   -     98    72    84    93    93    98
20    32     17       -   -   -   -   -   -   -   +   +   +   +   -     100   86    86    95    95    100
21    25     1        -   -   +   +   +   +   +   +   +   +   +   +     100   69    94    103   98    98
22    2      26       +   +   +   +   -   -   -   -   +   +   -   -     101   103   104   104   99    99
23    1      11       +   +   +   +   +   +   +   -   +   -   +   +     96    77    91    91    91    91
24    7      21       +   +   -   -   +   +   -   +   -   +   +   +     100   104   93    102   102   97
25    21     30       -   +   -   +   +   +   -   -   -   +   -   +     96    84    96    96    96    96
26    12     12       +   -   +   -   -   -   +   -   +   -   -   -     100   89    91    100   95    95
27    24     24       -   +   -   -   -   +   -   +   -   -   -   +     102   92    92    101   101   101
28    10     9        +   -   +   +   -   +   +   -   +   +   +   -     101   78    92    101   96    96
29    5      29       +   +   -   +   +   -   -   -   -   +   +   -     101   87    88    97    97    97
30    9      10       +   -   +   +   +   -   +   -   +   +   -   +     100   81    95    95    95    95
31    13     31       +   -   -   +   +   +   -   -   -   -   +   +     103   99    100   100   100   100
32    29     14       -   -   -   +   +   -   +   -   -   +   -   -     99    72    97    97    97    97

Standard deviation s_R                                                  2.1   13.5  6.6   4.3   3.6   2.6

Factors X1, X4 and X7, with their respective effects E_X1 = 13.0, E_X4 = 14.5 and E_X7 = 19.5, are screened out visually from the scatter diagram. The statistical significance of these effects is checked by calculating their values quantitatively from a three-way table and applying Student's t-test. The auxiliary three-way table is shown in Table 2.41.


Table 2.41 Auxiliary table with three inputs

              -X4                              +X4
        -X7            +X7               -X7            +X7
+X1     124            97                103            81
        113            89                99             78
        104                              96             77
        103                              87             77
        100
        93
        Σy1 = 637      Σy2 = 186         Σy3 = 385      Σy4 = 313
        ȳ1 = 106.17    ȳ2 = 93.00        ȳ3 = 96.25     ȳ4 = 78.25

-X1     100            88                93             78
        92             87                84             72
        86             86                72             72
                       83                               69
                       77                               66
        Σy5 = 278      Σy6 = 421         Σy7 = 249      Σy8 = 357
        ȳ5 = 92.67     ȳ6 = 84.20        ȳ7 = 83.00     ȳ8 = 71.40

The effects of factors are calculated thus:

$$E_{X_1} = \frac{\bar{y}_1+\bar{y}_2+\bar{y}_3+\bar{y}_4}{4} - \frac{\bar{y}_5+\bar{y}_6+\bar{y}_7+\bar{y}_8}{4} = \frac{106.17+93.00+96.25+78.25}{4} - \frac{92.67+84.20+83.00+71.40}{4} = +10.60$$

$$E_{X_4} = \frac{\bar{y}_3+\bar{y}_4+\bar{y}_7+\bar{y}_8}{4} - \frac{\bar{y}_1+\bar{y}_2+\bar{y}_5+\bar{y}_6}{4} = \frac{96.25+78.25+83.00+71.40}{4} - \frac{106.17+93.00+92.67+84.20}{4} = -11.79$$

$$E_{X_7} = \frac{\bar{y}_2+\bar{y}_4+\bar{y}_6+\bar{y}_8}{4} - \frac{\bar{y}_1+\bar{y}_3+\bar{y}_5+\bar{y}_7}{4} = \frac{93.00+78.25+84.20+71.40}{4} - \frac{106.17+96.25+92.67+83.00}{4} = -12.81$$
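The three effect formulas share one pattern: the mean of the four cells in which the factor is at the upper level minus the mean of the four cells in which it is at the lower level. The short Python sketch below, added for illustration, evaluates that pattern from the cell means of Table 2.41. The dictionary layout and the function name `effect` are assumptions made for this example.

```python
# Cell means from Table 2.41, keyed by the levels of (X1, X4, X7).
cell_means = {
    (+1, -1, -1): 106.17,
    (+1, -1, +1): 93.00,
    (+1, +1, -1): 96.25,
    (+1, +1, +1): 78.25,
    (-1, -1, -1): 92.67,
    (-1, -1, +1): 84.20,
    (-1, +1, -1): 83.00,
    (-1, +1, +1): 71.40,
}


def effect(means, position):
    """Mean of the upper-level cells minus mean of the lower-level cells."""
    upper = [m for levels, m in means.items() if levels[position] > 0]
    lower = [m for levels, m in means.items() if levels[position] < 0]
    return sum(upper) / len(upper) - sum(lower) / len(lower)


effects = {name: effect(cell_means, k) for k, name in enumerate(("X1", "X4", "X7"))}
for name, value in effects.items():
    print(f"E_{name} = {value:+.2f}")
```

Note that the cell means are averaged with equal weight even though the cells contain different numbers of observations, exactly as in the formulas above.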


The effects are checked by Student's t-criterion. The calculated values of the t-test are determined from these relations:


(2.37) 


(2.38) 


$$(\bar{y}_2+\bar{y}_4+\bar{y}_6+\bar{y}_8)-(\bar{y}_1+\bar{y}_3+\bar{y}_5+\bar{y}_7)$$


(2.39) 


By comparing the calculated values of the t-criterion with the tabular ones, we can see that the effects of factors X1, X4 and X7 are statistically significant. Screening then continues by response correction, i.e. by annulling the screened-out effects and repeating the procedure. Finally, the following effects were screened out and compared with the effects that were added when this example was defined.
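The significance check can be illustrated with a short Python sketch. It is an assumption-laden illustration, not the author's relations: it takes the t-value as the difference between the sums of cell means divided by s_R·sqrt(Σ 1/n_i), with s_R^2 the pooled within-cell variance of the eight cells of Table 2.41. The last value of the (-X1, +X4, +X7) cell is taken as 66 so that the cell sum equals the tabulated Σy8 = 357, and the tabular value of about 2.06 for 24 degrees of freedom at the 0.05 level is a standard table value.

```python
from math import sqrt

# Responses y in the cells of Table 2.41, keyed by the levels of (X1, X4, X7).
cells = {
    (+1, -1, -1): [124, 113, 104, 103, 100, 93],
    (+1, -1, +1): [97, 89],
    (+1, +1, -1): [103, 99, 96, 87],
    (+1, +1, +1): [81, 78, 77, 77],
    (-1, -1, -1): [100, 92, 86],
    (-1, -1, +1): [88, 87, 86, 83, 77],
    (-1, +1, -1): [93, 84, 72],
    (-1, +1, +1): [78, 72, 72, 69, 66],  # 66 assumed, see the note above
}

T_TABLE = 2.06  # Student's t for 24 degrees of freedom, two-sided 0.05 level


def pooled_variance(groups):
    """Pooled within-cell variance and its degrees of freedom."""
    ss, dof = 0.0, 0
    for values in groups:
        mean = sum(values) / len(values)
        ss += sum((v - mean) ** 2 for v in values)
        dof += len(values) - 1
    return ss / dof, dof


s2_r, dof = pooled_variance(cells.values())
scale = sqrt(s2_r) * sqrt(sum(1 / len(v) for v in cells.values()))

t_values = {}
for k, name in enumerate(("X1", "X4", "X7")):
    # Signed sum of cell means: upper-level cells minus lower-level cells.
    diff = sum(levels[k] * sum(v) / len(v) for levels, v in cells.items())
    t_values[name] = diff / scale

for name, t in t_values.items():
    print(f"t_{name} = {t:+.2f} (|t| > {T_TABLE}: {abs(t) > T_TABLE})")
```

Under these assumptions all three calculated values exceed the tabular one in absolute value, in line with the conclusion drawn in the text.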


Factors    Given effects    Estimated effects    Multiple regression effect estimate
X7         15               13                   15.9
X4         12               12                   12.1
X10X11     10               8                    10.1
X1         8                11                   8.7
X5X8       6                5                    6.2
X2         4                4                    4.9
X9         4                5                    5.1

The author's experience in applying the method of random balance indicates that there have been no situations where this method failed to screen out the significant factors. If a case does appear where no significant factors can be selected, the response variance will not be reduced even after correcting the response. This means that the most important factors have not been included in the experiment. It does not, of course, mean that the method is in error.

Problem  2.5 

The method of prior ranking of factors was applied unsuccessfully to the data in Problem 2.4 on the factors that affect the refining of petroleum oils by phenol. For this reason a design matrix was constructed for all sixteen factors, for experimental screening by the method of random balance. The design matrix with the outcomes of the experiment is shown in Table 2.42. Process the results by the method of random balance.

2.2.3 

Active  Screening  Experiment  Plackett-Burman  Designs 

The standard two-level designs recommended in the literature as active screening designs offer a choice of 4, 8, 16, 32 or more runs, but only in powers of two. In 1946, Plackett and Burman [64] introduced alternative two-level designs whose numbers of runs are multiples of four. The 12-, 20- and 28-run Plackett-Burman designs are of particular interest because they fill gaps in the standard designs. Unfortunately, these particular Plackett-Burman designs have very messy alias structures. For example, in the very popular 12-run design with 11 factors, each main effect is partially aliased with 45 two-factor interactions. In theory, you can get away with this if absolutely no interactions are present, but this is a very dangerous assumption. Because of the unexpected aliasing that occurs with many Plackett-Burman designs, it is recommended to avoid them in favor of the standard two-level designs.

Some of the available software packages for design of experiments use Plackett-Burman designs.
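The aliasing mentioned for the 12-run design can be made concrete with a short Python sketch. It is not from the original text: it assumes the commonly tabulated generating row + + - + + + - - - + - for the 12-run design, builds the remaining rows by cyclic shifts plus a final row of minus signs, and then counts the two-factor interactions that are correlated with the main effect of the first factor.

```python
from itertools import combinations

# Assumed generating row of the 12-run Plackett-Burman design.
GENERATOR = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def plackett_burman_12():
    """Eleven cyclic shifts of the generating row plus a row of all -1."""
    n = len(GENERATOR)
    rows = [GENERATOR[-shift:] + GENERATOR[:-shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


design = plackett_burman_12()
columns = list(zip(*design))

# Main-effect columns are mutually orthogonal.
orthogonal = all(dot(columns[i], columns[j]) == 0
                 for i, j in combinations(range(11), 2))

# Two-factor interactions (not involving factor 1) correlated with factor 1.
aliased = [
    (j + 1, k + 1)
    for j, k in combinations(range(1, 11), 2)
    if dot(columns[0], [a * b for a, b in zip(columns[j], columns[k])]) != 0
]

print("orthogonal main effects:", orthogonal)
print("interactions partially aliased with X1:", len(aliased))
```

The main effects are clean with respect to each other, but every one of the 45 interactions among the other ten factors leaks partly into the estimate of X1, which is the alias structure the text warns about.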

Problem 2.6 [12]

In the optimization of the isomerization of sulfanilamide, the design of experiments in the first phase was defined by the method of random balance, with the idea of doing an active screening experiment. The design of the experiment with its results is shown in Table 2.43. Screen the factors by the significance of their effects on the measured value.


Table 2.42 Design of experiment

Table 2.43 Design of experiment

No.    Factors                                             Responses
D.P.   X1   X2   X3   X4   X5   X6   X7   X8   X9   X10    y       y'
1      +    +    +    -    -    +    -    -    +    -      67.5    2.02
2      -    +    -    -    -    +    -    +    -    -      83.7    18.20
3      +    -    +    +    -    +    +    +    +    +      27.8    35.31
4      +    -    +    -    +    +    -    -    +    +      21.6    37.40
5      -    -    -    -    +    +    -    +    -    -      5.0     12.51
6      -    +    -    +    -    +    +    -    +    -      84.8    19.32
7      +    +    -    -    -    -    -    -    -    +      67.5    9.75
8      -    -    +    +    -    -    -    +    -    +      8.5     16.85
9      +    -    -    +    +    +    +    -    +    +      9.7     25.56
10     +    +    +    +    +    -    +    +    +    -      70.5    13.30
11     -    -    +    -    +    -    -    -    +    +      7.5     23.36
12     +    -    -    -    +    -    +    -    -    -      7.2     14.71
13     +    +    -    +    -    -    +    -    +    -      70.5    5.02
14     -    +    +    +    +    -    -    +    -    +      85.2    35.58
15     -    +    +    -    +    +    +    +    -    -      84.8    26.83
16     -    -    -    +    -    -    +    +    -    +      8.0     16.35

2.2.3 

Completely  Randomized  Block  Design 

In empirical or experimental research it is necessary to determine the stability of the system and the reproducibility of the results. In an experiment with a large number of design points, the problem arises of providing the same conditions for all design points. The experiment is therefore designed so that the research is done in groups of design points, called blocks, within which the conditions are more uniform than over the complete field of research. In this way we can separate the effects of unequal conditions from the effects of other factors. Designs of experiments that provide this are called completely randomized block designs. Such designs are used in research with a single factor, i.e. they belong to the group of single-factor designs. Results of an experiment done by a randomized complete block design are analyzed by the method of analysis of variance, which has been elaborated in detail in Sect. 1.5.

Since randomized blocks are applied to separate the effects of inhomogeneity of the research subject from the factor effects, the confidence of the analysis of variance is increased as the experimental error is diminished. A block denotes a group of design points within which the experimental error is lower than in the experiment as a whole.

To screen out the effects of systematic errors, the factor levels are researched in each block in random order. This is the origin of the term randomized complete-block design. These blocks originate from studies in agronomy, where the most drastic inequality was found between the agricultural lots on which experiments were done. To eliminate this inequality, the lots were divided into blocks, i.e. into more uniform areas. The size and number of blocks depend primarily on the research subject and on the possibility of equalizing the experimental conditions; these are two counterbalancing requirements on which the confidence of the processed results depends. A small number of blocks means simple calculations in the analysis of variance, but also lower confidence because the blocks are less uniform. When designs of completely randomized blocks are used, it is assumed that the outcome levels may differ from block to block but that the relative factor effects are the same in all blocks. In practice this assumption means that there is no interaction between blocks and factors, or that, if it exists, it is negligible in relation to the factor effect. The interaction is part of the experimental error, and when it is large the inferences of the analysis of variance are not certain. One of the basic objectives of designing experiments by completely randomized blocks, when an experiment is done in a full-scale plant, is to reduce the influence of the time element. In the normal operation of a chemical plant, in batches or continuously, systematic variations in product properties appear. Sometimes these variations may be explained by seasonal influences, such as a change in the temperature of cooling water, a change in the quality of raw materials, etc. Frequently, however, the fluctuations can be neither explained nor controlled. The question is not about the temporary, random variations that are considered normal in production, but about slow changes in the average level. Therefore, in designing an experiment in such plants, one must take care to separate the effects of time trends. This does not mean that designs of completely randomized blocks are limited to full-scale plants. On the contrary, they are also applicable in laboratories and pilot plants, where the trials of one experiment are done over a longer time period and where systematic variations and trends are probable [21].

Example 2.14 [21]

In a production plant, it should be determined experimentally whether four kinds of catalyst preparation affect the yield. Four methods of preparing catalysts are to be researched over a 6-month period, and each trial lasts one week. If each catalyst is researched for a whole month, differences may appear due to changes in the efficiency level of the plant, a change in raw materials or the like. If each catalyst is tested for only seven days, with all four tested in consecutive weeks, the effect of such variation is less significant, but the experiment loses precision since each catalyst is researched over a shorter time period. To profit from the six months available, the experiment is designed so that the entire period is split into six blocks, with all four catalysts tested within each month. To avoid the influence of systematic errors, the catalysts are tested within each month, or block, in a completely random order. When a linear change in plant efficiency during the experiment is expected in advance, this inequality of the research subject may be eliminated by applying the designs of experiments known as Latin squares.
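The layout described in this example, six monthly blocks with the four catalysts run once per block in random order, can be sketched as follows. The catalyst labels A to D, the function name and the fixed seed are hypothetical choices made only for this illustration.

```python
import random


def randomized_block_plan(treatments, n_blocks, seed=None):
    """Return one independently shuffled run order of all treatments per block."""
    rng = random.Random(seed)
    plan = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)          # random order inside the block
        plan.append(order)
    return plan


catalysts = ["A", "B", "C", "D"]    # hypothetical labels for the four preparations
plan = randomized_block_plan(catalysts, n_blocks=6, seed=1)

for month, order in enumerate(plan, start=1):
    weeks = ", ".join(f"week {w}: {c}" for w, c in enumerate(order, start=1))
    print(f"month {month} -> {weeks}")
```

Every catalyst appears exactly once in every month, so month-to-month drift affects all four equally, while the random weekly order guards against systematic errors within a month.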

It should be noted once again that the effects of the four catalysts relative to each other are assumed to remain the same in each month, i.e. there is no interaction between blocks (months) and factor levels (catalyst types). When such an interaction exists but is not large compared to the factor effect, the experiment will still be satisfactory. The sensitivity will be reduced, since the experimental error is increased by the effect of the interaction.

Analysis  of  experiment  results  by  design  of  completely  randomized  blocks 

Since the experimental results obtained by a design of completely randomized blocks are processed by analysis of variance, they are presented as a two-way classification with the notation introduced in Sect. 1.5. The only change is that the measured values, or responses, are denoted by y_ij and the factors by X_ij. The structure of a design of completely randomized blocks is given in Table 2.44.

Table 2.44 Structure of design of completely randomized blocks

Factor levels     Blocks                           Average values
                  1       2       3       J
1                 y11     y12     y13     y1J      ȳ1.
2                 y21     y22     y23     y2J      ȳ2.
3                 y31     y32     y33     y3J      ȳ3.
I                 yI1     yI2     yI3     yIJ      ȳI.
Average values    ȳ.1     ȳ.2     ȳ.3     ȳ.J      ȳ..

Table 2.45 Analysis of variance of completely randomized blocks

Source of variations    f    Sum of squares (definitions)    Sum of squares (practical calculations)    Mean square    Test statistic

The experiment is done by the design shown in Table 2.44, and the obtained results are processed by analysis of variance. Definitions with calculation forms of analysis of variance are shown in Table 2.45.

Although block effects are of less interest for the research objective, a large value of the variance between blocks means that the division into blocks was justified, i.e. that the residual variance has been reduced, which contributed to a higher precision of the experiment. A high variance between blocks also means that the causes of such pronounced inequality of experimental conditions should be found. It should be noted that the analysis of variance in Table 2.45 applies to an experiment done by a design of randomized blocks without replicated measurements.

The basic assumption of this kind of design of experiments without replicated measurements is that each measured result may be described by this model:

$$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij} \qquad (2.40)$$

i = 1, 2, ..., I; j = 1, 2, ..., J

where:

μ is the actual mean effect;

β_j is the actual block effect;
α_i is the actual factor effect;

ε_ij is the experimental error with normal distribution N(0, σ²).

Experiments may also be done by a design of completely randomized blocks with replicated measurements, in which case the analysis of variance has a different form (Table 2.46).

It should be noted that the number of measurement replications in the matrix of a design of completely randomized blocks is denoted by K. A distinction should also be made between the mean square for measurement error plus experimental error and the mean square for measurement error alone. Often the sum of measurement and experimental errors is simply called experimental error, and the measurement error is called sampling error. To check the significance of the factor effect, the mean square of the joint error, or experimental error, MSCr is used.

Table 2.46 (column headings): Source of variations    f    Sum of squares (definitions)    Sum of squares (practical calculations)    Mean square    Test statistic

Example 2.15 [21]

Studies of the chlorosulphonation of acetaniline showed that the yield of the chemical reaction was considerably below the theoretical one due to losses in the mother liquor. Research into the effects of five possible mixtures of acetaniline on the yield was suggested. These five mixtures were marked A, B, C, D and E. The experiments lasted 15 days, and since the production of one batch of acetaniline takes 24 h, the whole experimental program was divided into three blocks of five batches each. The outcomes are shown in Table 2.47. The results were processed by analysis of variance and are shown in Table 2.48. At the 95% confidence level neither the differences between blocks nor the effect of the different acetaniline mixtures was significant. However, at 90% confidence F_{4;8;0.90} = 2.81, so that the effect of the different acetaniline mixtures was statistically significant. Such a high level of risk in making a decision demands further research to select the best option of the technological procedure for acetaniline production. Analysis of variance showed that there were no significant differences between blocks, i.e. that no trend in the level of the obtained results was noticed.


Table 2.47 Design of experiment

Block   Batch   Acetaniline mixture   Percentage of loss
I       1       B                     18.2
        2       A                     16.9
        3       C                     17.0
        4       E                     18.3
        5       D                     15.1
II      6       A                     16.5
        7       E                     18.3
        8       B                     19.2
        9       C                     18.1
        10      D                     16.0
III     11      B                     17.1
        12      D                     17.8
        13      C                     17.3
        14      E                     19.8
        15      A                     17.5

Table 2.48 Analysis of variance

Source of variation   f     SS      MS     F      F_{f;8;0.95}
Blocks                2     1.65    0.82   0.94   4.46
Factor levels         4     11.56   2.89   3.32   3.84
Error                 8     6.99    0.87   -      -
Total                 14    20.20   -      -      -
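The entries of Table 2.48 can be reproduced from the data of Table 2.47 with the usual two-way sums of squares for a design without replicated measurements. The Python sketch below is an added illustration; the function name `block_anova` and the layout of the data dictionary are assumptions made for this example.

```python
# Percentage of loss from Table 2.47: block -> {mixture: loss}.
data = {
    "I": {"B": 18.2, "A": 16.9, "C": 17.0, "E": 18.3, "D": 15.1},
    "II": {"A": 16.5, "E": 18.3, "B": 19.2, "C": 18.1, "D": 16.0},
    "III": {"B": 17.1, "D": 17.8, "C": 17.3, "E": 19.8, "A": 17.5},
}


def block_anova(blocks):
    """Two-way analysis of variance for randomized blocks without replication."""
    levels = sorted(next(iter(blocks.values())))
    n_blocks, n_levels = len(blocks), len(levels)
    values = [y for block in blocks.values() for y in block.values()]
    correction = sum(values) ** 2 / len(values)

    ss_total = sum(y * y for y in values) - correction
    ss_blocks = sum(sum(b.values()) ** 2 for b in blocks.values()) / n_levels - correction
    ss_levels = sum(sum(b[k] for b in blocks.values()) ** 2 for k in levels) / n_blocks - correction
    ss_error = ss_total - ss_blocks - ss_levels

    f_blocks, f_levels = n_blocks - 1, n_levels - 1
    ms_error = ss_error / (f_blocks * f_levels)
    return {
        "SS": {"blocks": ss_blocks, "levels": ss_levels,
               "error": ss_error, "total": ss_total},
        "F": {"blocks": ss_blocks / f_blocks / ms_error,
              "levels": ss_levels / f_levels / ms_error},
    }


anova = block_anova(data)
for source, ss in anova["SS"].items():
    print(f"SS {source:7s} = {ss:6.2f}")
for source, f_value in anova["F"].items():
    print(f"F  {source:7s} = {f_value:6.2f}")
```

The F-value for the mixtures lies between the 90% and 95% tabular values quoted in the example, which is why the effect is judged significant only at the lower confidence level.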

Example  2.16  [22] 

In an experiment designed as completely randomized blocks, the effect of the cobalt content (% Co) on the tensile strength of steel was researched. Three vessels for producing alloys were used in the experimental procedure. Each measurement of tensile strength was replicated, and the outcomes are shown in thousands of psi in Table 2.49.

Table 2.49 Completely randomized blocks with measurement replications

Vessel-block   1% Co      2% Co      3% Co      4% Co
1              49  50     60  62     64  67     71  75
2              44  45     53  56     63  65     65  67
3              53  56     64  65     74  78     76  80

The results of analysis of variance are shown in Table 2.50.

Table 2.50 Analysis of variance

Sources of variations   f     SS       MS       F        F_{f;6;0.95}
Blocks                  2     485.3    242.65   42.80    5.14
Factor levels           3     1847.5   615.83   108.61   4.76
Experimental error      6     34.0     5.67     -        -
Sampling error          12    45.0     3.75     -        -
Total                   23    2411.8   -        -        -
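When measurements are replicated, the total sum of squares splits into blocks, factor levels, experimental error (block by level) and sampling error (within cells). The Python sketch below, added as an illustration with hypothetical names, applies this split to the data of Table 2.49 and tests blocks and factor levels against the experimental error, as in Table 2.50.

```python
# Tensile strength (thousands of psi) from Table 2.49:
# data[block][level] holds the replicated measurements of one cell.
data = [
    [(49, 50), (60, 62), (64, 67), (71, 75)],
    [(44, 45), (53, 56), (63, 65), (65, 67)],
    [(53, 56), (64, 65), (74, 78), (76, 80)],
]


def replicated_block_anova(table):
    """Sums of squares and F-values for randomized blocks with replications."""
    n_blocks, n_levels, k = len(table), len(table[0]), len(table[0][0])
    values = [y for row in table for cell in row for y in cell]
    correction = sum(values) ** 2 / len(values)

    ss_total = sum(y * y for y in values) - correction
    ss_blocks = sum(sum(sum(c) for c in row) ** 2 for row in table) / (n_levels * k) - correction
    ss_levels = sum(sum(sum(row[j]) for row in table) ** 2
                    for j in range(n_levels)) / (n_blocks * k) - correction
    ss_cells = sum(sum(c) ** 2 for row in table for c in row) / k - correction
    ss_experimental = ss_cells - ss_blocks - ss_levels
    ss_sampling = ss_total - ss_cells

    ms_experimental = ss_experimental / ((n_blocks - 1) * (n_levels - 1))
    return {
        "SS": {"blocks": ss_blocks, "levels": ss_levels,
               "experimental": ss_experimental, "sampling": ss_sampling,
               "total": ss_total},
        "F": {"blocks": ss_blocks / (n_blocks - 1) / ms_experimental,
              "levels": ss_levels / (n_levels - 1) / ms_experimental},
    }


anova = replicated_block_anova(data)
for source, ss in anova["SS"].items():
    print(f"SS {source:12s} = {ss:8.2f}")
for source, f_value in anova["F"].items():
    print(f"F  {source:12s} = {f_value:8.2f}")
```

Both F-values are far above the tabular values in Table 2.50, so under these assumptions the vessels (blocks) and the cobalt content both affect the tensile strength significantly.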

Problem  2.7  [22] 

In the previous experiment, the effect of cobalt content on the tensile strength of steel was researched. We now try to find the effect of three percentage contents of cobalt on this response when each alloy is tempered in each of four furnaces. The twelve alloy batches were each divided into three samples, so that the tensile strength of 36 samples was measured. These values, in thousands of psi, are shown in Table 2.51. Do the analysis of variance.

Table 2.51 Completely randomized blocks with measurement replications

Furnace block   1% Co          2% Co          3% Co
1               50  47  55     64  59  55     60  66  69
2               43  42  51     48  55  62     59  60  70
3               49  53  57     66  63  65     67  83  75
4               45  49  55     67  63  60     63  72  70

Problem  2.8 

Process the results of the previous problem by analysis of variance, assuming that the experiment by design of completely randomized blocks was done with no measurement replications. Use the means of the replicated measurements for this analysis, as shown in Table 2.52.

Compare the results of the analysis from the previous problem with those from this problem.


Table 2.52 Completely randomized blocks with no replications

Furnace block   1% Co    2% Co    3% Co
1               50.67    59.33    65.00
2               45.33    55.00    63.00
3               53.00    64.67    75.00
4               49.67    63.33    68.33

2.2.3.1 Incomplete Random Block Design

The foregoing analysis of designs of completely randomized blocks covered designs in which all combinations of factor and block levels are present. When these designs are carried out in practice, certain results may be missing because a measurement was impossible, material was lacking, a measurement error occurred, etc. According to the number and position of the missing data one can distinguish balanced and unbalanced incomplete random blocks. Since balanced blocks are a special case of unbalanced incomplete random blocks, only unbalanced designs will be analyzed [23]. Let us analyze an (I × J)-dimensional array of experimental results with K missing values, Table 2.53.

Table 2.53 Incomplete random blocks

Factor levels   Block
                1     2     3     J
1               Y     -     Y     ...
2               -     Y     Y     Y
3               Y     Y     -     Y
I               Y     Y     -     Y

The missing values are estimated so that the analysis of variance can be
completed even when some values are lacking. There are two methods for this estimate:

• direct method, when fewer data than the sum of rows and columns are missing;

• constant method, when more data than the sum of rows and columns are
missing.

Direct  methods 

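Missing data $Z_{ij}$ are determined from the condition that the residual sum of squares
$SS_E$ is minimal.

\[ SS_E = SS_T - SS_C - SS_R \tag{2.41} \]

where:

$SS_T$ is the sum of squares of deviations of individual results from the mean of all
results;

$SS_C$ is the sum of squares of deviations of block means from the mean of all results;
$SS_R$ is the sum of squares of deviations of factor-level means from the mean of all results.
If we now denote by S the sum of all individual values, measured ($Y_{ij}$) and estimated ($Z_{ij}$),

\[ S = \sum_i \sum_j Y_{ij} + \sum_i \sum_j Z_{ij} \]

and by $S_{i\cdot}$ and $S_{\cdot j}$ the totals of factor level i and of block j (estimated values
included), the sums of squares may be written as:

\[ SS_T = \sum_i \sum_j Y_{ij}^2 + \sum_i \sum_j Z_{ij}^2 - \frac{S^2}{IJ} \tag{2.42} \]

\[ SS_C = \frac{1}{I} \sum_j S_{\cdot j}^2 - \frac{S^2}{IJ} \tag{2.43} \]

\[ SS_R = \frac{1}{J} \sum_i S_{i\cdot}^2 - \frac{S^2}{IJ} \tag{2.44} \]

After substituting relations (2.42)-(2.44) into Eq. (2.41) we get:

\[ SS_E = \sum_i \sum_j Y_{ij}^2 + \sum_i \sum_j Z_{ij}^2 - \frac{1}{I} \sum_j S_{\cdot j}^2 - \frac{1}{J} \sum_i S_{i\cdot}^2 + \frac{S^2}{IJ} \tag{2.45} \]

By partial differentiation of the residual sum of squares with respect to each $Z_{ij}$ and
setting the derivatives to zero, we obtain the normal equations. By solving these
simultaneous equations we determine the individual values of the results that are not available.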


Example  2.17 

When testing durability or wear-out of car tires, the effect of four kinds of
plasticizers was researched. Due to problems in doing the experiment, the following
outcomes were obtained:


Table 2.54 Incomplete random blocks

Plasticizer type   Car tire block
                   1     2     3
A                  238   196   254
B                  238   213   -
C                  279   -     334
D                  -     308   367

The values Z41, Z32 and Z23 are evidently missing, so that we get Table 2.55:


Table 2.55 Incomplete random blocks

Plasticizer type   Car tire block                  Sum
                   1          2          3
A                  238        196        254        688
B                  238        213        Z23        451+Z23
C                  279        Z32        334        613+Z32
D                  Z41        308        367        675+Z41
Sum                755+Z41    717+Z32    955+Z23    2427+Z23+Z32+Z41
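The residual sum of squares is:

\[
\begin{aligned}
SS_E ={}& Z_{41}^2 + Z_{32}^2 + Z_{23}^2 + \sum_i \sum_j Y_{ij}^2 \\
 &- \frac{(755+Z_{41})^2 + (717+Z_{32})^2 + (955+Z_{23})^2}{4} \\
 &- \frac{688^2 + (451+Z_{23})^2 + (613+Z_{32})^2 + (675+Z_{41})^2}{3} \\
 &+ \frac{(2427+Z_{23}+Z_{32}+Z_{41})^2}{12}
\end{aligned}
\]

By partial differentiation, $\partial SS_E/\partial Z_{23}$, $\partial SS_E/\partial Z_{32}$ and $\partial SS_E/\partial Z_{41}$, and setting
the derivatives to zero we have:

\[
\begin{aligned}
6Z_{41} + Z_{32} + Z_{23} &= 2538 \\
Z_{41} + 6Z_{32} + Z_{23} &= 2176 \\
Z_{41} + Z_{32} + 6Z_{23} &= 2242
\end{aligned}
\tag{2.46}
\]

By solving the system of linear equations we get:

Z41=333.7; Z32=261.3; Z23=274.5.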
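
The normal equations of the direct method can also be solved numerically. The sketch below is an illustration under stated assumptions: the function name `estimate_missing`, the use of `None` for missing cells and the Gauss-Seidel style iteration are choices made here, not the book's procedure. Each missing cell is repeatedly replaced by the value that zeroes its own partial derivative of the residual sum of squares, which for a table with I rows and J columns is (I·row total + J·column total − grand total)/((I−1)(J−1)), with the totals taken without the cell itself.

```python
def estimate_missing(table, tol=1e-9, max_iter=1000):
    """Fill None cells of a two-way table so that the residual sum of squares is minimal."""
    n_rows, n_cols = len(table), len(table[0])
    missing = [(i, j) for i in range(n_rows) for j in range(n_cols) if table[i][j] is None]
    known = [v for row in table for v in row if v is not None]
    start = sum(known) / len(known)  # starting guess: mean of the available results
    filled = [[start if v is None else float(v) for v in row] for row in table]
    for _ in range(max_iter):
        change = 0.0
        for i, j in missing:
            # Totals computed without the cell that is being re-estimated.
            row_sum = sum(filled[i]) - filled[i][j]
            col_sum = sum(filled[r][j] for r in range(n_rows)) - filled[i][j]
            total = sum(sum(row) for row in filled) - filled[i][j]
            new = (n_rows * row_sum + n_cols * col_sum - total) / ((n_rows - 1) * (n_cols - 1))
            change = max(change, abs(new - filled[i][j]))
            filled[i][j] = new
        if change < tol:
            break
    return filled


# Table 2.54: rows are plasticizers A-D, columns are car tire blocks 1-3.
tire_wear = [
    [238, 196, 254],
    [238, 213, None],   # Z23 missing
    [279, None, 334],   # Z32 missing
    [None, 308, 367],   # Z41 missing
]
completed = estimate_missing(tire_wear)
print(round(completed[3][0], 1), round(completed[2][1], 1), round(completed[1][2], 1))
```

For a single missing cell the formula gives the estimate in one step; with several missing cells the iteration converges to the solution of the simultaneous equations.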


Constant method

Let us denote block effects by $b_1, b_2, \ldots, b_J$, factor effects by $a_1, a_2, \ldots, a_I$, and by m the
mean of all results. By definition $\sum_i a_i = \sum_j b_j = 0$, and the number of independent
constants to be estimated is equal to: number of rows + number of columns - 1. Let $Y_{ij}$
denote the response value in row i and column j. The expected value in row i and
column j, based on column effects and row effects, is $a_i + b_j + m$, while the deviation
of the experimental value $Y_{ij}$ from it is:

\[ Y_{ij} - (a_i + b_j + m) \tag{2.47} \]

The values that are not available are estimated from the condition that the sum of
squared deviations is minimal.

\[ S = \sum \left[ Y_{ij} - (a_i + b_j + m) \right]^2 \tag{2.48} \]

Values $a_i$, $b_j$ and m are obtained from this system of normal equations:

\[ \frac{\partial S}{\partial a_i} = 0; \qquad \frac{\partial S}{\partial b_j} = 0; \qquad \frac{\partial S}{\partial m} = 0 \tag{2.49} \]
Example 2.18

Expected values of the experimental results of Example 2.17 are shown in Table 2.56.

The sum of squared deviations is obtained by squaring the differences between the
expected values (Table 2.56) and the experimentally obtained results (Table 2.54). By
partial differentiation with respect to $a_i$, $b_j$ and m and setting the derivatives to
zero, we get the system of normal equations (2.50).

Table 2.56 Incomplete random blocks

Factor levels   Block
                1           2           3
1               m+a1+b1     m+a1+b2     m+a1+b3
2               m+a2+b1     m+a2+b2     -
3               m+a3+b1     -           m+a3+b3
4               -           m+a4+b2     m+a4+b3

\[
\begin{aligned}
2427 - 9m - 3(b_1 + b_2 + b_3) - 3a_1 - 2(a_2 + a_3 + a_4) &= 0 \\
688 - 3m - 3a_1 - (b_1 + b_2 + b_3) &= 0 \\
451 - 2m - 2a_2 - (b_1 + b_2) &= 0 \\
613 - 2m - 2a_3 - (b_1 + b_3) &= 0 \\
675 - 2m - 2a_4 - (b_2 + b_3) &= 0 \\
755 - 3m - 3b_1 - (a_1 + a_2 + a_3) &= 0 \\
717 - 3m - 3b_2 - (a_1 + a_2 + a_4) &= 0 \\
955 - 3m - 3b_3 - (a_1 + a_3 + a_4) &= 0
\end{aligned}
\tag{2.50}
\]


Solving this system of simultaneous equations gives:

m=274.708; b1=-2.533; b2=-30.133; b3=32.667; a1=-45.375; a2=-32.875; a3=16.725;

a4=61.525

The values that are not available are determined from the obtained constants through
Eq. (2.47):

\[ Y_{ij} = m + a_i + b_j \]

For example, for Z41 we get the same value as with the direct method:

Z41 = m + a4 + b1 = 274.708 + 61.525 - 2.533 = 333.7
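
The constants m, a_i and b_j can also be obtained numerically. The following sketch is a hedged illustration, not the book's procedure: the function name `fit_additive` and the alternating update scheme (each set of effects is refitted while the other is held fixed, then both are re-centered so that they sum to zero) are assumptions introduced here. It minimizes the same sum of squares (2.48) over the available results of Table 2.54.

```python
def fit_additive(table, n_iter=200):
    """Least-squares fit of Y_ij = m + a_i + b_j using only the available (non-None) cells."""
    n_rows, n_cols = len(table), len(table[0])
    cells = [(i, j, table[i][j]) for i in range(n_rows) for j in range(n_cols)
             if table[i][j] is not None]
    m = sum(y for _, _, y in cells) / len(cells)
    a = [0.0] * n_rows
    b = [0.0] * n_cols
    for _ in range(n_iter):
        # Refit row (factor) effects with column effects held fixed.
        for i in range(n_rows):
            res = [y - m - b[j] for r, j, y in cells if r == i]
            a[i] = sum(res) / len(res)
        # Refit column (block) effects with row effects held fixed.
        for j in range(n_cols):
            res = [y - m - a[i] for i, c, y in cells if c == j]
            b[j] = sum(res) / len(res)
        # Re-center so that sum(a) = sum(b) = 0; the fitted values m + a_i + b_j are unchanged.
        a_bar, b_bar = sum(a) / n_rows, sum(b) / n_cols
        a = [x - a_bar for x in a]
        b = [x - b_bar for x in b]
        m += a_bar + b_bar
    return m, a, b


# Table 2.54: rows are plasticizers A-D (effects a_i), columns are tire blocks 1-3 (effects b_j).
observed = [
    [238, 196, 254],
    [238, 213, None],
    [279, None, 334],
    [None, 308, 367],
]
m, a, b = fit_additive(observed)
z41 = m + a[3] + b[0]
z32 = m + a[2] + b[1]
z23 = m + a[1] + b[2]
print(round(m, 3), round(z41, 1), round(z32, 1), round(z23, 1))
```

Because both methods minimize the same residual sum of squares, the estimates of the missing cells coincide with those of the direct method.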

Although the direct method is more appropriate for the given example, because the
number of missing values is smaller than the sum of rows and columns, the constant
method has also been demonstrated for the sake of comparison. It should be noted
that both methods are generally used in two-way classifications such as designs of
completely randomized blocks, Latin squares, factorial experiments, etc. Once the
missing values are estimated, the averages of individual blocks and factor levels are
calculated and the analysis of variance is done. The degrees of freedom are thereby
counted only with respect to the number of experimental values. Results of the
analysis of variance for this example are shown in Table 2.57. The total number of
degrees of freedom is 8 and not 3x4-1=11, since three values have been estimated.
The residual variance in incomplete random blocks therefore has a higher value than
in complete ones, so the F-test is less sensitive.

Table 2.57 Analysis of variance of incomplete random blocks

Sources of variation   f   SS         MS        F       F(f;3;0.95)
Blocks                 2   4933.87    2466.94   17.36   9.55
Factor levels          3   14733.03   4911.01   34.56   9.28
Error                  3   426.30     142.10    -       -
Total                  8   20093.20   -         -       -

It  is  clear  from  the  table  of  analysis  of  variance  that  the  factor  effect  is  statistically 
highly  significant.  The  effect  of  blocks  is  also  important,  which  justifies  the  division 
of  experimental  conditions  into  blocks. 

2.2.4 Latin Squares

Latin squares are designs of experiments that are especially useful in research,
development and optimization in the phase of screening factors. Whereas randomized
blocks are used to eliminate one cause of inequality (nonhomogeneity) of a research
subject, Latin squares are applied to separate out two causes of inequality.
Inequality of experimental conditions is reduced even more by applying Latin
squares, which facilitates a more precise analysis of the effect of the researched
factor. Since Latin squares are primarily used in single-factor experiments for
researching the effect of one factor when a double inequality, or a double division
of the research subject into randomized blocks, is present, one may say that Latin
squares are an extension of designs of completely randomized blocks, or a 1/m
replicate of an m^3 full factorial experiment.

Example  2.19 

Consider an experiment where the durability or wear-out of four types of car tires
has to be researched. Sixteen tires are at our disposal, four of each type. The
research will be done on four cars. The factor in this case is the type of car tire.
There are, however, two additional factors that affect the durability of tires:

• car  type, 

• tire  position  on  car. 

Let us mark car type as I, II, III and IV and the position of tires on each car as FR,
FL, RL and RR. These two factors are singled out as sources of inequality of
experimental conditions and are arranged into blocks by rows and columns. Tire types
are assigned to the cars by random choice, but so that each tire type appears once on
each car and once in each position.

A Latin square design is frequently applied when the effect of one factor on several
nominally identical devices is researched over a long time. In that case, rows
of the design correspond to successive time periods, and columns to experimental
devices.

Experimental designs are square in form (m x m), and each level of the researched
factor is tested once in each row and once in each column. Table 2.58 shows an
example of a 4x4 Latin square design.

Table 2.58 Design of experiment of Latin square

Rows   Columns
       1   2   3   4
1      A   B   C   D
2      B   C   D   A
3      C   D   A   B
4      D   A   B   C

It  is  clear  from  the  table  that  A,  B,  C and  D are  levels  of  the  researched  factor.  An 
important  condition  for  applying  design  of  Latin  squares  is  that  in  each  column  and 
each  row  one  factor  level  may  appear  once  and  only  once. 
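
The defining condition, that each factor level appears once and only once in every row and every column, is easy to check mechanically. A minimal Python sketch follows; the function name `is_latin_square` and the representation of a design as a list of strings are assumptions made for illustration.

```python
def is_latin_square(square):
    """Return True if every symbol occurs exactly once in each row and each column."""
    m = len(square)
    symbols = set(square[0])
    if len(symbols) != m:
        return False
    rows_ok = all(len(row) == m and set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(m)} == symbols for j in range(m))
    return rows_ok and cols_ok


# The 4x4 design of Table 2.58, one string per row.
table_2_58 = ["ABCD", "BCDA", "CDAB", "DABC"]
print(is_latin_square(table_2_58))

# A design that repeats level D in the last row and last column is rejected.
broken = ["ABCD", "BCDA", "CDAB", "DABD"]
print(is_latin_square(broken))
```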

Analysis of variance of Latin squares

The model for a general m x m Latin square with one observation per cell is:

\[ Y_{ij(k)} = \mu + \alpha_i + \beta_j + \tau_k + \varepsilon_{ij(k)}; \quad i = 1,2,\ldots,m; \; j = 1,2,\ldots,m; \; k = 1,2,\ldots,m \tag{2.51} \]

where:

\[ \sum_i \alpha_i = \sum_j \beta_j = \sum_k \tau_k = 0 \]

$\varepsilon_{ij(k)}$ has the normal distribution $N(0, \sigma^2)$.

Variables $\alpha_i$, $\beta_j$ and $\tau_k$ are the actual effects of row i, column j and factor level k.
One can notice that the index k is bracketed to indicate that in the design of Latin
squares there are not m^3 results, as is the case with a three-factor design with one
design-point replication. The design of Latin squares actually has m^2 observations or
data.

Analysis of variance for an m x m Latin square with one observation per cell, in
accordance with model (2.51), is shown in Table 2.59. The associated sums of data per
row, column and factor level are denoted $Y_{i\cdot(\cdot)}$, $Y_{\cdot j(\cdot)}$ and $Y_{\cdot\cdot(k)}$.

Table 2.59 shows that the total variance of experimental results is divided into
variances of:

• R row-variation sources;

• C column-variation sources;

• T factor-variation sources;

• residual or experimental error.

Latin squares may be applied only when interactions are negligible with respect to
the experimental error. Since interactions do exist when complex systems are
researched, Latin square designs are not widely applied.

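Table 2.59 Analysis of variance of m x m Latin square

Source of variation   f            SS                                                                              MS
Rows                  m-1          $SS_R = \sum_i Y_{i\cdot(\cdot)}^2 / m - Y_{\cdot\cdot(\cdot)}^2 / m^2$         $MS_R = SS_R/(m-1)$
Columns               m-1          $SS_C = \sum_j Y_{\cdot j(\cdot)}^2 / m - Y_{\cdot\cdot(\cdot)}^2 / m^2$        $MS_C = SS_C/(m-1)$
Factor                m-1          $SS_T = \sum_k Y_{\cdot\cdot(k)}^2 / m - Y_{\cdot\cdot(\cdot)}^2 / m^2$         $MS_T = SS_T/(m-1)$
Residual              (m-1)(m-2)   $SS_E = \sum_i \sum_j \sum_k Y_{ij(k)}^2 - Y_{\cdot\cdot(\cdot)}^2 / m^2 - SS_R - SS_C - SS_T$   $MS_E = SS_E/[(m-1)(m-2)]$
Total                 m^2-1        $\sum_i \sum_j \sum_k Y_{ij(k)}^2 - Y_{\cdot\cdot(\cdot)}^2 / m^2$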


Example  2.20  [15] 

As  already  defined  in  Example  2.19,  four  types  of  tires  A,  B,  C and  D have  been 
researched  by  design  of  Latin  squares  in  this  way:  sixteen  tires,  four  of  each  type, 
have  been  put  on  four  cars  (I,  II,  III  and  IV)  under  these  conditions: 

• each  car  has  one  tire  of  each  of  the  four  types; 

• each  type  of  tire  is  in  positions  FL,  FR,  RL  and  RR; 

• actual  position  of  each  of  16  tires  has  been  randomly  chosen  for  4 x 4 Latin 
square. 

Wear-out  of  tires  has  been  measured  after  8000  km  by  a standard  procedure.  The 
obtained  results  are  shown  in  Table  2.60. 

Table 2.60 Design of experiment - Latin square

Car                     Position                                $Y_{i\cdot(\cdot)}$
                        FR      FL      RR      RL
I                       A 31    B 33    C 47    D 54    165
II                      B 36    D 53    A 42    C 54    185
III                     C 51    A 43    D 62    B 49    205
IV                      D 81    C 78    B 72    A 84    315
$Y_{\cdot j(\cdot)}$    199     207     223     241     870

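After calculation we obtain:

\[ \frac{Y_{\cdot\cdot(\cdot)}^2}{16} = \frac{870^2}{16} = 47306.25; \]

\[ \frac{\sum_j Y_{\cdot j(\cdot)}^2}{4} = \frac{199^2 + 207^2 + 223^2 + 241^2}{4} = \frac{190260}{4} = 47565; \]

\[ \frac{\sum_i Y_{i\cdot(\cdot)}^2}{4} = \frac{165^2 + 185^2 + 205^2 + 315^2}{4} = \frac{202700}{4} = 50675; \]

\[ \frac{\sum_k Y_{\cdot\cdot(k)}^2}{4} = \frac{(31+43+42+84)^2 + \ldots + (81+53+62+54)^2}{4} = \frac{200^2 + 190^2 + 230^2 + 250^2}{4} = 47875. \]

So that:

\[ SS_R = \frac{\sum_i Y_{i\cdot(\cdot)}^2}{4} - \frac{Y_{\cdot\cdot(\cdot)}^2}{16} = 3368.75; \]

\[ SS_C = \frac{\sum_j Y_{\cdot j(\cdot)}^2}{4} - \frac{Y_{\cdot\cdot(\cdot)}^2}{16} = 258.75; \]

\[ SS_T = \frac{\sum_k Y_{\cdot\cdot(k)}^2}{4} - \frac{Y_{\cdot\cdot(\cdot)}^2}{16} = 568.75; \]

\[ SS_E = \sum_i \sum_j \sum_k Y_{ij(k)}^2 - \frac{Y_{\cdot\cdot(\cdot)}^2}{16} - SS_R - SS_C - SS_T = 51540.00 - 47306.25 - 3368.75 - 258.75 - 568.75 = 37.50. \]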

Table 2.61 Analysis of variance

Sources of variation   f    SS        MS         F        F(f;6;0.95)
Rows (cars)            3    3368.75   1122.917   179.66   4.76
Columns (positions)    3    258.75    86.250     13.80    4.76
Factor (tire type)     3    568.75    189.583    30.33    4.76
Experimental error     6    37.50     6.250      -        -
Total                  15   4233.75   -          -        -

Statistically significant differences between cars, tire positions and tire types
are clearly evident from the analysis of variance. Table 2.60 indicates the highest
tire wear-out on car IV and the lowest on car I. This does not present an error but
is simply the upper and lower limit of tire wear-out in this research.
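
The sums of squares of this example can be recomputed directly from Table 2.60 with the formulas of Table 2.59. The sketch below is only an illustration: the function name `latin_square_anova` and the data layout (one list of responses and one string of factor letters per car, in the column order FR, FL, RR, RL) are assumptions made here.

```python
def latin_square_anova(values, letters):
    """Sums of squares and F ratios for an m x m Latin square with one observation per cell."""
    m = len(values)
    grand = sum(sum(row) for row in values)
    correction = grand ** 2 / m ** 2
    row_totals = [sum(row) for row in values]
    col_totals = [sum(values[i][j] for i in range(m)) for j in range(m)]
    factor_totals = {}
    for i in range(m):
        for j in range(m):
            key = letters[i][j]
            factor_totals[key] = factor_totals.get(key, 0) + values[i][j]
    ss = {
        "rows": sum(t ** 2 for t in row_totals) / m - correction,
        "columns": sum(t ** 2 for t in col_totals) / m - correction,
        "factor": sum(t ** 2 for t in factor_totals.values()) / m - correction,
        "total": sum(v ** 2 for row in values for v in row) - correction,
    }
    ss["error"] = ss["total"] - ss["rows"] - ss["columns"] - ss["factor"]
    ms_error = ss["error"] / ((m - 1) * (m - 2))
    f_ratios = {k: (ss[k] / (m - 1)) / ms_error for k in ("rows", "columns", "factor")}
    return ss, f_ratios


# Table 2.60: rows are cars I-IV, columns are positions FR, FL, RR, RL.
wear = [[31, 33, 47, 54], [36, 53, 42, 54], [51, 43, 62, 49], [81, 78, 72, 84]]
tire_type = ["ABCD", "BDAC", "CADB", "DCBA"]
ss, f_ratios = latin_square_anova(wear, tire_type)
for source in ("rows", "columns", "factor", "error", "total"):
    print(source, round(ss[source], 2))
print({k: round(v, 2) for k, v in f_ratios.items()})
```

The F ratios are then compared with the tabulated value for 3 and 6 degrees of freedom, as in Table 2.61.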
Example 2.21 [24]

In finding the composition for neutral light filters based on polymethyl methacrylate
(PMMA) it was necessary to research the effect of four different dyestuffs (X3),
prepared in four different ways (X2) and used in four concentrations (X1), on
the optical properties of the filter. The experiment was done by a 4 x 4 Latin square
design. Samples of the obtained PMMA were tested for light permeability within these
spectral ranges: 240-300 (Y1); 300-340 (Y2); 300-1000 (Y3), and for general light
permeability (Y4). The GOST requirement for light filters is that Y1 does not exceed
0.5%, Y2 20% and Y3 45%, and that general permeability is between 20% and 26%. Since
no analyzed response has a universal property, overall desirability D should be taken
as a representative estimate of the quality of the light filter.

\[ D = \sqrt[4]{d_1 \times d_2 \times d_3 \times d_4} \tag{2.52} \]

where

$d_i$ are partial desirabilities.
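
Equation (2.52) is the geometric mean of the partial desirabilities, so a single property with zero desirability drives the overall desirability to zero. A small Python sketch of this behaviour follows; the function name and the partial desirability values used in the calls are hypothetical and are not taken from the experiment.

```python
import math


def overall_desirability(partials):
    """Geometric mean of the partial desirabilities d_i, each expected in [0, 1]."""
    if any(d == 0 for d in partials):
        return 0.0  # one unacceptable property makes the whole product unacceptable
    return math.prod(partials) ** (1.0 / len(partials))


# Hypothetical partial desirabilities for the four responses Y1..Y4.
print(round(overall_desirability([0.9, 0.8, 0.7, 0.6]), 3))
print(overall_desirability([0.9, 0.0, 0.7, 0.6]))
```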

Desirability  scale  for  individual  properties  is  shown  in  Figs.  2.26  and  2.27: 


Figure 2.27 Desirability scale


The  results  of  design  and  performance  of  experiment  by  4x4  Latin  square  are 
shown  in  Table  2.62. 


Table 2.62 Design of experiment by 4x4 Latin square

Levels of variation   Factors
                      X1      X2   X3
0                     0.006   1    A
1                     0.008   2    B
2                     0.010   3    C
3                     0.020   4    E

No.   Factors           Responses
      X1   X2   X3      Y1    Y2     Y3     Y4     D
1     0    0    A       0.1   10.9   35.8   20.5   0.73
2     0    1    B       0.0   46.1   70.1   69.5   0
3     0    2    C       0.1   19.8   72.8   39.5   0
4     0    3    E       0.2   53.0   69.2   66.3   0
5     1    0    B       0.0   44.0   63.8   61.5   0
6     1    1    C       0.0   24.0   72.8   40.9   0
7     1    2    E       0.0   22.5   65.3   48.8   0
8     1    3    A       0.0   5.0    20.1   22.1   0.83
9     2    0    C       0.0   21.0   70.3   34.0   0
10    2    1    E       0.1   32.8   53.8   51.0   0
11    2    2    A       0.0   3.0    29.8   12.8   0.50
12    2    3    B       0.0   16.5   50.8   35.0   0.34
13    3    0    E       0.0   10.0   37.0   31.9   0.66
14    3    1    A       0.1   0.5    9.0    2.3    0
15    3    2    B       0.1   18.0   36.0   26.3   0.67
16    3    3    C       0.0   7.0    70.8   25.8   0

The  results  of  analysis  of  variance  are  shown  in  Table  2.63. 

Table 2.63 Analysis of variance

Sources of variation   f    SS         MS         F      F(f;6;0.95)
Factor X1              3    0.333042   0.111014   1.77   4.76
Factor X2              3    0.069641   0.023214   0.37   4.76
Factor X3              3    1.573387   0.524462   8.39   4.76
Experimental error     6    0.374770   0.062462   -      -
Total                  15   2.350840   -          -      -

Analysis  of  variance  shows  that  only  factor  effect  X3  is  significant.  This  means 
that  out  of  four  different  dyestuffs  one  should  choose  the  most  convenient  one. 
Problem 2.9 [21]

The procedure of spraying an insecticide over a treated area is very
important for its efficient use. A study has therefore been done with
the idea of establishing the best procedure for preparing a spraying
insecticide. The following three factors have been researched:

1. type of component mixing;

2. form of active matter;

3. solvent type.

Under the assumption that there is no significant interaction
between the given factors, the design of Latin squares has been
applied. In this case, seven solvent types, seven methods of mixing and
seven forms of active matter have been researched. Stability of the
insecticide is measured as the response. The results are shown in Table 2.64.
Do the analysis of variance.


Table 2.64 Design matrix of 7 x 7 Latin square

Type of mixing   Form of active matter                                         Sum
                 1       2       3      4      5       6       7
1                A 98    B 117   C 89   D 64   E 63    F 132   G 244   807
2                B 69    E 67    A 70   G 70   F 111   D 60    C 218   665
3                C 37    F 83    G 83   B 74   D 70    A 75    E 169   591
4                D 65    G 60    E 91   F 56   C 61    B 59    A 150   542
5                E 56    D 44    B 70   C 68   A 88    G 111   F 220   657
6                F 113   C 105   D 65   A 51   G 83    E 57    B 233   707
7                G 64    A 62    F 65   E 86   B 45    C 108   D 187   617
Sum              502     538     533    469    521     602     1421    4586

Problem  2.10  [15] 

The design of a 4 x 4 Latin square has been used in researching the
effects of water pressure, air flow and number of nozzles in operation
on scrubber efficiency. Research outcomes are shown in
Table 2.65. Do the analysis of variance for the given data.


Table 2.65 Design matrix of 4 x 4 Latin square

Nozzles in operation   Airflow                       Sum
                       5120   2560   4160   6400
N_A                    90     81     95     95       361
N_C                    -      80     88     85       253
N_D                    83     83     88     67       321
N_B                    96     95     88     94       373
Sum                    269    339    359    341      1308
Problem 2.11 [25]

An alternator for a missile has been constructed and made. The
alternator is driven by a turbine, and the turbine by the combustion
products of the gas generator burning solid rocket propellant. The
alternator is composed of three separate sections that generate
energy, each of them being independent. One section provides
energy for keeping the alternator speed at 24000 min^-1. It is
composed of a 4-pole stator, a 6-pole rotor and a shaft. The rotor rotates
concentrically in the stator opening, while the stator is fixed in its box.

The stator consists of coils for direct and alternating currents. The
output voltage of the alternating current is a function of the input direct
current and the number of alternating-current coils. The rotor is built up of
laminates 0.004 inches thick. The laminates are coated for
insulation. The experiment is aimed at finding the factors and their levels
that affect the alternator performance most significantly. The
design of experiment was a 5 x 5 Latin square with these factors and
levels:

• number of stator coils for alternating current: 145; 150; 155; 160
and 165;

• number of rotor laminates: 230; 240; 250; 260 and 270;

• visual quality of laminate insulation: A; B; C; D and E, where A is the best
and E the worst quality.

The result of measurement is the maximal parasitic voltage of the
alternating current. All the measured values have been reduced by
300 to make calculation easier and are shown in Table 2.66. Do the
analysis of variance.


Table 2.66 Design of experiment of 5 x 5 Latin square

Rotors   Stators                                  Sum
         145    150    155    160    165
230      C 10   B 12   A 20   D 6    E 0     48
240      D 9    C 10   B 24   E 0    A 5     48
250      B 12   E 3    C 25   A 7    D 2     49
260      A 16   D 6    E 18   C 4    B -6    38
270      E 14   A 8    D 23   B 9    C 3     57
Sum      61     39     110    26     4       240
The following Latin squares may be used for practical problem solving:

4x4

1.          2.          3.          4.
A B C D     A B C D     A B C D     A B C D
B A D C     B C D A     B D A C     B A D C
C D B A     C D A B     C A D B     C D A B
D C A B     D A B C     D C B A     D C B A

5x5

A B C D E
B A E C D
C D A E B
D E B A C
E C D B A

6x6

A B C D E F
B F D C A E
C D E F B A
D A F E C B
E C A B F D
F E B A D C

7x7

A B C D E F G
B C D E F G A
C D E F G A B
D E F G A B C
E F G A B C D
F G A B C D E
G A B C D E F

8x8

A B C D E F G H
B C D E F G H A
C D E F G H A B
D E F G H A B C
E F G H A B C D
F G H A B C D E
G H A B C D E F
H A B C D E F G

9x9

A B C D E F G H I
B C D E F G H I A
C D E F G H I A B
D E F G H I A B C
E F G H I A B C D
F G H I A B C D E
G H I A B C D E F
H I A B C D E F G
I A B C D E F G H

10x10

A B C D E F G H I J
B C D E F G H I J A
C D E F G H I J A B
D E F G H I J A B C
E F G H I J A B C D
F G H I J A B C D E
G H I J A B C D E F
H I J A B C D E F G
I J A B C D E F G H
J A B C D E F G H I

11x11

A B C D E F G H I J K
B C D E F G H I J K A
C D E F G H I J K A B
D E F G H I J K A B C
E F G H I J K A B C D
F G H I J K A B C D E
G H I J K A B C D E F
H I J K A B C D E F G
I J K A B C D E F G H
J K A B C D E F G H I
K A B C D E F G H I J

12x12

A B C D E F G H I J K L
B C D E F G H I J K L A
C D E F G H I J K L A B
D E F G H I J K L A B C
E F G H I J K L A B C D
F G H I J K L A B C D E
G H I J K L A B C D E F
H I J K L A B C D E F G
I J K L A B C D E F G H
J K L A B C D E F G H I
K L A B C D E F G H I J
L A B C D E F G H I J K
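
The tabulated squares from 7x7 to 12x12 are cyclic: each row is the previous row shifted one place to the left. A square of this kind can be generated for any order, as in the following hedged sketch; the function name `cyclic_latin_square` is an assumption made here, and the construction is offered only as an illustration of the pattern visible in the tables.

```python
from string import ascii_uppercase


def cyclic_latin_square(m):
    """Build an m x m Latin square whose row i is the alphabet shifted left by i places."""
    if not 1 <= m <= len(ascii_uppercase):
        raise ValueError("m must be between 1 and 26")
    letters = ascii_uppercase[:m]
    return [letters[i:] + letters[:i] for i in range(m)]


for row in cyclic_latin_square(7):
    print(" ".join(row))
```

Every letter occurs once per row by construction, and once per column because column j holds the letters j, j+1, ..., taken modulo m.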

2.2.5 Graeco-Latin Square

By applying Latin squares one can study three independent factors simultaneously,
or analyze the effect of one factor while singling out two inequalities of a
research subject, such as, for instance, several experimental devices and different
times of doing the experiment. The concept of Latin squares may easily be extended
to Graeco-Latin squares. This kind of design allows the same number of levels of a
factor to be experimentally researched on the same number of experimental devices
at different times and in different locations. Use of these designs results in
exceptional savings of time and means, as the total number of design points is
drastically reduced. However, similar to Latin squares, these designs may be used
only when interactions are statistically insignificant. Graeco-Latin squares are
constructed so that one Latin square with Latin characters is put over another
Latin square with Greek characters, in such a way that each Graeco-Latin pair of
characters appears once and only once. Under these conditions n x n Graeco-Latin
squares can be constructed for every n except n = 2 and n = 6 (a square of order 10
exists, but is not among those tabulated below). In constructing Graeco-Latin
squares, numbers are frequently used instead of Greek characters. Generally
speaking, rows represent levels of one factor, columns levels of the second factor,
Greek characters or numbers levels of the third factor and Latin characters levels
of the fourth factor. As has already been said, Graeco-Latin squares may be
considered as designs with three types of blocks. An n x n Graeco-Latin square
requires n^2 design points or observations, which is considerably fewer than the
n^4 design points of a full factorial experimental design.
The following Graeco-Latin squares may be used for practical needs:

3x3

A1 B3 C2
B2 C1 A3
C3 A2 B1

4x4

A1 B3 C4 D2
B2 A4 D3 C1
C3 D1 A2 B4
D4 C2 B1 A3

5x5

A1 B3 C5 D2 E4
B2 C4 D1 E3 A5
C3 D5 E2 A4 B1
D4 E1 A3 B5 C2
E5 A2 B4 C1 D3

7x7

A1 B5 C2 D6 E3 F7 G4
B2 C6 D3 E7 F4 G1 A5
C3 D7 E4 F1 G5 A2 B6
D4 E1 F5 G2 A6 B3 C7
E5 F2 G6 A3 B7 C4 D1
F6 G3 A7 B4 C1 D5 E2
G7 A4 B1 C5 D2 E6 F3

8x8

A1 B5 C2 D3 E7 F4 G8 H6
B2 A8 G1 F7 H3 D6 C5 E4
C3 G4 A7 E1 D2 H5 B6 F8
D4 F3 E6 A5 C8 B1 H7 G2
E5 H1 D8 C4 A6 G3 F2 B7
F6 D7 H4 B8 G5 A2 E3 C1
G7 C6 B3 H2 F1 E8 A4 D5
H8 E2 F5 G6 B4 C7 D1 A3

9x9

A1 B3 C2 D7 E9 F8 G4 H6 I5
B2 C1 A3 E8 F7 D9 H5 I4 G6
C3 A2 B1 F9 D8 E7 I6 G5 H4
D4 E6 F5 G1 H3 I2 A7 B9 C8
E5 F4 D6 H2 I1 G3 B8 C7 A9
F6 D5 E4 I3 G2 H1 C9 A8 B7
G7 H9 I8 A4 B6 C5 D1 E3 F2
H8 I7 G9 B5 C4 A6 E2 F1 D3
I9 G8 H7 C6 A5 B4 F3 D2 E1

11x11

A1  B7  C2  D8  E3  F9  G4  H10 I5  J11 K6
B2  C8  D3  E9  F4  G10 H5  I11 J6  K1  A7
C3  D9  E4  F10 G5  H11 I6  J1  K7  A2  B8
D4  E10 F5  G11 H6  I1  J7  K2  A8  B3  C9
E5  F11 G6  H1  I7  J2  K8  A3  B9  C4  D10
F6  G1  H7  I2  J8  K3  A9  B4  C10 D5  E11
G7  H2  I8  J3  K9  A4  B10 C5  D11 E6  F1
H8  I3  J9  K4  A10 B5  C11 D6  E1  F7  G2
I9  J4  K10 A5  B11 C6  D1  E7  F2  G8  H3
J10 K5  A11 B6  C1  D7  E2  F8  G3  H9  I4
K11 A6  B1  C7  D2  E8  F3  G9  H4  I10 J5

12x12

A1  B12 C6  D7  I5  J4  K10 L11 E9  F8  G2  H3
B2  A11 D5  C8  J6  I3  L9  K12 F10 E7  H1  G4
C3  D10 A8  B5  K7  L2  I12 J9  G11 H6  E4  F1
D4  C9  B7  A6  L8  K1  J11 I10 H12 G5  F3  E2
E5  F4  G10 H11 A9  B8  C2  D3  I1  J12 K6  L7
F6  E3  H9  G12 B10 A7  D1  C4  J2  I11 L5  K8
G7  H2  E12 F9  C11 D6  A4  B1  K3  L10 I8  J5
H8  G1  F11 E10 D12 C5  B3  A2  L4  K9  J7  I6
I9  J8  K2  L3  E1  F12 G6  H7  A5  B4  C10 D11
J10 I7  L1  K4  F2  E11 H5  G8  B6  A3  D9  C12
K11 L6  I4  J1  G3  H10 E8  F5  C7  D2  A12 B9
L12 K5  J3  I2  H4  G9  F7  E6  D8  C1  B11 A10
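
The property that makes a square Graeco-Latin can be checked mechanically: the Latin characters form a Latin square, the numbers (used in place of Greek characters) form a Latin square, and every character-number pair occurs exactly once. The following Python sketch illustrates the check on the 4x4 square listed above; the function names and the encoding of each cell as a string such as "A1" are assumptions made for the illustration.

```python
def _is_latin(grid):
    """True if each symbol occurs exactly once in every row and every column."""
    n = len(grid)
    symbols = set(grid[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in grid)
    cols_ok = all({grid[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def is_graeco_latin(square):
    """Check a square whose cells are strings: one Latin letter followed by a number."""
    n = len(square)
    latin = [[cell[0] for cell in row] for row in square]
    greek = [[cell[1:] for cell in row] for row in square]
    pairs = {cell for row in square for cell in row}
    # Both component squares must be Latin and all n*n pairs must be distinct.
    return _is_latin(latin) and _is_latin(greek) and len(pairs) == n * n


square_4x4 = [
    ["A1", "B3", "C4", "D2"],
    ["B2", "A4", "D3", "C1"],
    ["C3", "D1", "A2", "B4"],
    ["D4", "C2", "B1", "A3"],
]
print(is_graeco_latin(square_4x4))

# Superimposing a cyclic square on itself gives Latin components but repeated pairs.
not_orthogonal = [["A1", "B2", "C3"], ["B2", "C3", "A1"], ["C3", "A1", "B2"]]
print(is_graeco_latin(not_orthogonal))
```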
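Table 2.67 Analysis of variance of n x n Graeco-Latin square

Source of variations              f            SS                                                                                        MS
Factor I (rows)                   n-1          $SS_R = \sum_i Y_{i\cdot(\cdot\cdot)}^2 / n - Y_{\cdot\cdot(\cdot\cdot)}^2 / n^2$        $MS_R = SS_R/(n-1)$
Factor II (columns)               n-1          $SS_C = \sum_j Y_{\cdot j(\cdot\cdot)}^2 / n - Y_{\cdot\cdot(\cdot\cdot)}^2 / n^2$       $MS_C = SS_C/(n-1)$
Factor III (Latin characters)     n-1          $SS_L = \sum_k Y_{\cdot\cdot(k\cdot)}^2 / n - Y_{\cdot\cdot(\cdot\cdot)}^2 / n^2$        $MS_L = SS_L/(n-1)$
Factor IV (Greek characters)      n-1          $SS_G = \sum_l Y_{\cdot\cdot(\cdot l)}^2 / n - Y_{\cdot\cdot(\cdot\cdot)}^2 / n^2$       $MS_G = SS_G/(n-1)$
Experimental error                (n-1)(n-3)   $SS_E = SS_T - SS_R - SS_C - SS_L - SS_G$                                                 $MS_E = SS_E/[(n-1)(n-3)]$
Total                             n^2-1        $SS_T = \sum_i \sum_j \sum_k \sum_l Y_{ij(kl)}^2 - Y_{\cdot\cdot(\cdot\cdot)}^2 / n^2$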

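The mathematical model of Graeco-Latin squares has the form:

\[ Y_{ij(kl)} = \mu + \alpha_i + \beta_j + \tau_k + \gamma_l + \varepsilon_{ij(kl)}; \quad i, j, k, l = 1, 2, \ldots, n \tag{2.53} \]

where:

\[ \sum_i \alpha_i = \sum_j \beta_j = \sum_k \tau_k = \sum_l \gamma_l = 0 \]

$\varepsilon_{ij(kl)}$ has the normal distribution $N(0, \sigma^2)$.

Analysis of variance for Graeco-Latin squares is shown in Table 2.67.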

Example  2.22  [21] 

The subject of research is a change in the technology of tempering copper pipes. The
basic requirement to be observed when making these changes is that the minimal tensile
strength should be 17 tons/in^2. It has been suggested in the new technological
procedure to temper copper pipes at a lower temperature and not to draw them out after
tempering. The experiment has been designed specifically to research the suggested
modification of the technological procedure. In defining the design of experiment
one should take care of inequalities such as variations in material and in the
temperature inside the tempering furnace. Variations in quality of the initial material
are accounted for by randomly drawing eight copper pipes on each of eight days.
These eight days are distributed within a three-week period with the idea of covering
normal variations in quality of the observed technological procedure. In order to
eliminate the unequal distribution of furnace temperatures, the furnace has been
divided into eight rows and eight columns, or into 64 shelves. Next, the experiment
has been designed by Graeco-Latin squares with two factors:

• day of production;

• number assigned to each copper pipe in a sample of eight.

The factor "number" has no physical meaning; it has been included to enable
identification of tensile strength results for each pipe at different tempering
temperatures. In a different technological procedure these numbers might indicate the
order of daily production or different procedures of preparing copper pipes, etc.
Results of the experiment by a Graeco-Latin square are shown in Table 2.68. The
table gives rows and columns as positions in the furnace, characters A to H as day of
production, and numbers 1 to 8 as copper pipes in the sample. The results of
measurements give tensile strength in tons per square inch.

Table 2.68 Design of experiment of 8 x 8 Graeco-Latin square

Row \ Column   1        2        3        4        5        6        7        8        Sum
1              16.6 D3  16.9 H4  17.4 C5  17.4 B6  15.8 E8  18.2 A1  15.7 G2  15.8 F7  133.8
2              15.9 F6  16.4 E5  15.8 G4  19.0 A3  17.6 H2  17.8 B7  18.9 C8  17.1 D1  138.5
3              17.1 B5  16.8 C6  19.2 H3  16.6 D4  15.8 G1  17.8 F8  18.4 E7  18.3 A2  140.0
4              17.7 A4  15.9 G3  16.3 E6  16.0 F5  17.6 C7  17.8 D2  18.1 H1  16.5 B8  135.9
5              17.4 C1  17.0 B2  16.8 D8  19.2 H7  20.3 A5  18.4 E3  15.9 F4  15.7 G6  140.7
6              16.5 E2  16.0 F1  16.9 A7  15.9 G8  17.1 D6  17.5 C4  17.4 B3  19.6 H5  136.9
7              15.8 G7  16.9 A8  15.9 F2  16.5 E1  17.6 B4  19.4 H6  17.1 D5  18.5 C3  137.5
8              18.6 H8  17.4 D7  17.4 B1  19.2 C2  16.8 F3  15.7 G5  17.4 A6  18.4 E4  140.9
Sum            135.6    133.3    135.7    139.8    138.6    142.6    138.9    139.7    1104.2

Table 2.69 Analysis of variance of 8 x 8 Graeco-Latin square

Source of variations    f    SS       MS      F       F_{f;35;0.95}
Between rows            7    543.2    77.6    1.47    2.29
Between columns         7    769.9    110.0   2.08    2.29
Between positions **    14   1313.1   93.8    1.77    1.98
Between days            7    4831.7   690.3   13.05   2.29
Between numbers *       7    322.2    46.0    0.87    2.29
Experimental error *    35   1852.9   52.9    -       -
Total                   63   8319.9   -       -       -

As expected, the effect of differences between numbers is statistically insignificant and is of the same size as the experimental error. Hence we may pool the sums of squares for the factor number and for the experimental error into a new combined experimental error*, with 42 degrees of freedom. Variations between columns and rows in the furnace may also be combined, and the resulting effect between positions** in the furnace is at the limit of statistical significance. The differences between copper pipes from day to day are highly significant. This was later confirmed, since the compositions of the copper pipes produced on days F and G differed. As the tensile strength level for tempering at 300 °C with no later drawing out of all copper pipes was below 17, the change in technological procedure cannot be accepted.

Example  2.23  [25] 

The objective of this experiment is to determine the effects of the following factors on the corrosion rate of silicon rods: (1) discoloration of nitric acid, (2) volume of corrosion matter, (3) size of silicon rod and (4) time spent in corrosion matter. The experiment went like this: five bottles of fresh nitric acid were exposed to different amounts of sunlight until the color changed from colorless to light yellow.

These kinds of nitric acid were used to produce a solution for corrosion. Five groups of silicon rods were sorted by weight. Five different volumes of corrosion matter and five corrosion times were used in the experiment. The order in which the design points were run was random. The measure of corrosion intensity is the loss in weight of the silicon rods after rinsing and drying them. The results of the experiment are shown in Table 2.70.

Table 2.70 Design of experiment of 5 x 5 Graeco-Latin square

Volume of           Color of nitric acid
corrosion matter    1        2        3        4        5
1                   65 A1    82 B3    108 C5   101 D2   126 E4
2                   84 B2    109 C4   73 D1    97 E3    83 A5
3                   105 C3   129 D5   89 E2    89 A4    52 B1
4                   119 D4   72 E1    76 A3    117 B5   84 C2
5                   97 E5    59 A2    94 B4    78 C1    106 D3
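Whether a layout such as that of Table 2.70 really is a Graeco-Latin square can be checked mechanically. The sketch below is an illustration only; the function name `is_graeco_latin` and the cell encoding (one letter followed by a number) are assumptions made for this example, not notation from the text. It tests that the letters and the numbers each form a Latin square and that every letter-number pair occurs exactly once.

```python
from itertools import product

# Layout of Table 2.70: rows = volume levels, columns = acid color levels.
# Each cell is "<letter><number>" (hypothetical encoding for this sketch).
DESIGN = [
    ["A1", "B3", "C5", "D2", "E4"],
    ["B2", "C4", "D1", "E3", "A5"],
    ["C3", "D5", "E2", "A4", "B1"],
    ["D4", "E1", "A3", "B5", "C2"],
    ["E5", "A2", "B4", "C1", "D3"],
]


def is_latin(square):
    """True if every symbol occurs exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def is_graeco_latin(design):
    """True if letters and numbers are two orthogonal Latin squares."""
    n = len(design)
    letters = [[cell[0] for cell in row] for row in design]
    numbers = [[cell[1:] for cell in row] for row in design]
    # Orthogonality: all n*n (letter, number) pairs must be distinct.
    pairs = {(letters[i][j], numbers[i][j]) for i, j in product(range(n), repeat=2)}
    return is_latin(letters) and is_latin(numbers) and len(pairs) == n * n


print(is_graeco_latin(DESIGN))
```

The same check can be applied to any of the square designs in this section before the analysis of variance is carried out.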

Results  of  analysis  of  variance  are  shown  in  Table  2.71. 
Table 2.71 Analysis of variance of 5 x 5 Graeco-Latin square

Sources of variations   f    SS        MS        F       F_{f;8;0.95}
Color of acid           4    227.76    56.94     0.47    3.84
Volume                  4    285.76    71.44     0.59    3.84
Size of rods            4    2867.76   716.94    5.96    3.84
Time of corrosion       4    5536.56   1384.14   11.50   3.84
Experimental error      8    962.72    120.34    -       -
Total                   24   9880.56   -         -       -
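The sums of squares of Table 2.71 can be recomputed from the data of Table 2.70 with a few lines of code. The following is a minimal sketch under the usual Graeco-Latin square formulas (class totals squared, divided by n, minus the correction term); the function name `graeco_latin_anova` and the dictionary keys are hypothetical choices for this illustration.

```python
from collections import defaultdict

# Table 2.70: design (letter + number per cell) and measured weight losses.
DESIGN = [
    ["A1", "B3", "C5", "D2", "E4"],
    ["B2", "C4", "D1", "E3", "A5"],
    ["C3", "D5", "E2", "A4", "B1"],
    ["D4", "E1", "A3", "B5", "C2"],
    ["E5", "A2", "B4", "C1", "D3"],
]
RESPONSE = [
    [65, 82, 108, 101, 126],
    [84, 109, 73, 97, 83],
    [105, 129, 89, 89, 52],
    [119, 72, 76, 117, 84],
    [97, 59, 94, 78, 106],
]


def graeco_latin_anova(design, response):
    """Return sums of squares, degrees of freedom, mean squares and F ratios."""
    n = len(design)
    names = ("rows", "columns", "letters", "numbers")
    totals = {name: defaultdict(float) for name in names}
    grand = 0.0
    sum_sq = 0.0
    for i in range(n):
        for j in range(n):
            y = response[i][j]
            cell = design[i][j]
            totals["rows"][i] += y
            totals["columns"][j] += y
            totals["letters"][cell[0]] += y
            totals["numbers"][cell[1:]] += y
            grand += y
            sum_sq += y * y
    correction = grand ** 2 / n ** 2
    ss = {name: sum(t * t for t in totals[name].values()) / n - correction
          for name in names}
    ss["total"] = sum_sq - correction
    ss["error"] = ss["total"] - sum(ss[name] for name in names)
    df = {name: n - 1 for name in names}
    df["error"] = (n - 1) * (n - 3)
    df["total"] = n * n - 1
    ms = {name: ss[name] / df[name] for name in df if name != "total"}
    f_ratio = {name: ms[name] / ms["error"] for name in names}
    return ss, df, ms, f_ratio


ss, df, ms, f_ratio = graeco_latin_anova(DESIGN, RESPONSE)
for name in ("rows", "columns", "letters", "numbers", "error", "total"):
    print(name, df[name], round(ss[name], 2))
```

With these data the row totals reproduce the Volume line of Table 2.71, the column totals the Color of acid line, the letter totals the Size of rods line and the number totals the Time of corrosion line.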
Problem 2.12 [25]

Analyze the results of a 4 x 4 Graeco-Latin square.

Table 2.72 Design of experiment of 4 x 4 Graeco-Latin square

Rows   Columns                          Sum
       1       2       3       4
1      6 A1    4 B3    7 C4    5 D2     22
2      5 B2    6 A4    3 D3    4 C1     18
3      4 C3    5 D1    8 A2    4 B4     21
4      3 D4    2 C2    8 B1    6 A3     19
Sum    18      17      26      19       80

Problem  2.13  [25] 

The data given below are the results of 25 design points performed at five temperatures and with five different time periods, with the idea of establishing the effects of these factors on conversion in a chemical reactor. To avoid inequality effects, five chemical reactors and five operators were included in the experiment. The 25 design points were thus run in five reactors with five operators according to a 5 x 5 Graeco-Latin square, in such a way that each operator and each reactor appears only once at each temperature and at each conversion time, and each operator uses each reactor only once. Letters denote the reactors and numbers the operators. Do the analysis of variance.


Table 2.73 Design of experiment of 5 x 5 Graeco-Latin square

Temperature   Time                                          Sum
              30       60       90       120      150
100           16 A1    40 B3    50 C5    20 D2    15 E4     141
125           30 B2    25 C4    62 D1    67 E3    30 A5     214
150           50 C3    50 D5    83 E2    85 A4    45 B1     313
175           80 D4    80 E1    95 A3    98 B5    70 C2     423
200           90 E5    92 A2    98 B4    100 C1   88 D3     468
Sum           266      287      388      370      248       1559

2.2.6 

Youden Squares

Consider the design of an incomplete random block with four blocks and four levels of the researched factor.

This design is shown in Table 2.74.
Table 2.74 Design of experiment

Block   Factor
1       A   B   C
2       D   C   A
3       B   A   D
4       C   D   B

This kind of design of experiment is known as a Youden square. In essence it is an incomplete Latin square design, since the fourth column, with factor level D in the first block, B in the second, C in the third and A in the fourth, is missing. The design of experiment from Table 2.74 may be written differently, as shown in Table 2.75.


Table 2.75 Youden design of experiment

Block   Factor
        A   B   C   D
1       α   β   γ   -
2       γ   -   β   α
3       β   α   -   γ
4       -   γ   α   β

The design of experiment written in this form is a reconstructed Latin square design where one of the diagonals has been left out. Generally speaking, a Youden square is a symmetrical balanced incomplete random block where each factor level appears once and only once in each block position.

A Youden square is always a Latin square where one or more columns (or rows or diagonals) have been left out. The converse is not true: a Latin square where one or more columns (or rows or diagonals) have been left out is not always a Youden square, because leaving out columns from a Latin square may destroy the balance of the design. It is, however, possible to construct Youden squares from all symmetrical balanced incomplete random blocks [26]. Youden squares have the same number of rows and levels of the researched factor, but a different number of columns.
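The two defining properties stated above (each factor level once in each block position, and balance) can be tested directly. The sketch below is an illustration under the assumption that balance means every pair of factor levels occurs together in the same number of blocks; the names `pair_counts` and `is_youden_square` are hypothetical. It is applied to the design of Table 2.74 and to a Latin square from which too many columns have been removed.

```python
from collections import Counter
from itertools import combinations

# Blocks of Table 2.74; the index inside each tuple is the block position.
BLOCKS = [("A", "B", "C"), ("D", "C", "A"), ("B", "A", "D"), ("C", "D", "B")]


def pair_counts(blocks):
    """Count in how many blocks each pair of factor levels occurs together."""
    counts = Counter()
    for block in blocks:
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts


def is_youden_square(blocks):
    """Check the position property and the pairwise balance of a design."""
    levels = sorted({level for block in blocks for level in block})
    k = len(blocks[0])
    # Each factor level must appear exactly once in each block position.
    positions_ok = all(
        sorted(block[p] for block in blocks) == levels for p in range(k)
    )
    # Balance: all pairs of levels share the same number of blocks.
    counts = pair_counts(blocks)
    balanced = len({counts[pair] for pair in combinations(levels, 2)}) == 1
    return positions_ok and balanced


print(is_youden_square(BLOCKS))

# First two columns of a cyclic 4 x 4 Latin square: positions are fine,
# but the pairs (A, C) and (B, D) never meet, so the balance is lost.
TRUNCATED = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
print(is_youden_square(TRUNCATED))
```

For Table 2.74 every pair of levels meets in two blocks, which agrees with K(K-1)/(I-1) = 3·2/3 = 2 for I = 4 and K = 3.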

This notation is used:

1) I: number of levels of the researched factor;

2) J: number of levels of the first source of inequality (blocks);

3) K: number of levels of the second source of inequality;

4) L: number of replications of each factor level.

The following Youden squares may be used in practice:
Table 2.76 I=J=7; K=L=3

Rows   Columns
       1   2   3
1      G   A   C
2      A   B   D
3      B   C   E
4      C   D   F
5      D   E   G
6      E   F   A
7      F   G   B

Table 2.77 I=J=11; K=L=5

Rows   Columns
       1   2   3   4   5
1      A   B   C   D   E
2      G   A   F   J   C
3      I   H   A   F   B
4      K   I   G   A   D
5      J   K   E   H   A
6      H   G   B   C   K
7      B   F   D   K   J
8      F   C   K   E   I
9      C   D   J   I   H
10     E   J   I   B   G
11     D   E   H   G   F

Table 2.78 I=J=7; K=L=4

Rows   Columns
       1   2   3   4
1      D   F   G   A
2      E   G   A   B
3      F   A   B   C
4      G   B   C   D
5      A   C   D   E
6      B   D   E   F
7      C   E   F   G

Table 2.79 I=J=13; K=L=4

Rows   Columns
       1   2   3   4   5   6   7   8   9   10  11  12  13
1      A   B   C   D   E   F   G   H   I   J   K   L   M
2      B   C   D   E   F   G   H   I   J   K   L   M   A
3      D   E   F   G   H   I   J   K   L   M   A   B   C
4      J   K   L   M   A   B   C   D   E   F   G   H   I

Table 2.80 I=J=15; K=L=7

Rows   Columns
       1   2   3   4   5   6   7   8   9   10  11  12  13  14  15
1      A   B   C   D   E   F   G   H   I   J   K   L   M   N   O
2      B   C   D   E   F   G   H   I   J   K   L   M   N   O   A
3      C   D   E   F   G   H   I   J   K   L   M   N   O   A   B
4      E   F   G   H   I   J   K   L   M   N   O   A   B   C   D
5      F   G   H   I   J   K   L   M   N   O   A   B   C   D   E
6      I   J   K   L   M   N   O   A   B   C   D   E   F   G   H
7      K   L   M   N   O   A   B   C   D   E   F   G   H   I   J

Table 2.81 I=J=16; K=L=6

Rows   Columns
       1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
1      A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P
2      B   C   D   A   F   G   H   E   J   K   L   I   N   O   P   M
3      C   D   A   B   G   H   E   F   K   L   I   J   O   P   M   N
4      E   F   G   H   I   J   K   L   M   N   O   P   A   B   C   D
5      L   I   J   K   P   M   N   O   D   A   B   C   H   E   F   G
6      M   N   O   P   A   B   C   D   E   F   G   H   I   J   K   L

Table 2.82 I=J=16; K=L=10

Rows   Columns
       1   2   3   4   5   6   7   8   9   10  11  12  13  14  15  16
1      A   B   C   D   E   F   G   H   I   J   K   L   M   N   O   P
2      C   A   B   E   F   D   J   I   H   G   M   K   L   P   N   O
3      D   C   A   K   M   G   H   E   L   I   J   B   P   O   F   N
4      N   E   P   A   H   B   D   C   F   K   O   G   I   J   L   M
5      M   N   O   P   B   A   F   D   E   C   G   I   J   H   K   L
6      B   J   H   G   A   I   L   O   M   N   D   C   E   F   P   K
7      L   K   I   B   O   P   N   A   D   F   C   H   G   E   M   J
8      J   H   F   L   G   M   A   P   K   O   B   N   C   D   E   I
9      I   P   L   O   N   K   C   M   J   A   H   E   F   B   D   G
10     O   M   K   J   L   N   P   G   A   E   F   D   B   I   C   H

For easier calculation, the Youden square I=J=7; K=L=4 is transformed into a balanced incomplete random block, as shown in Table 2.83.

Note that the levels of the second source of inequality are given by the numbers in brackets.

Table 2.83 Youden 7 x 4 square

Factor   Blocks
         1     2     3     4     5     6     7
A        (4)   (3)   (2)   -     (1)   -     -
B        -     (4)   (3)   (2)   -     (1)   -
C        -     -     (4)   (3)   (2)   -     (1)
D        (1)   -     -     (4)   (3)   (2)   -
E        -     (1)   -     -     (4)   (3)   (2)
F        (2)   -     (1)   -     -     (4)   (3)
G        (3)   (2)   -     (1)   -     -     (4)
The previously introduced notation is:

• I: number of levels of the researched factor;

• J: number of levels of the first source of inequality (blocks);

• K: number of levels of the second source of inequality;

• L: number of replications of each factor level.

Analysis of variance for Youden squares is shown in Table 2.84.


Table 2.84 Analysis of variance for Youden square

Sources of variations         f             SS     MS                         F
Blocks                        I-1           SS1    -                          -
Corrected factor              I-1           SS2    MS2 = SS2/(I-1)            MS2/MSE
Factor                        J-1           SS3    -                          -
Corrected blocks              J-1           SS4    MS4 = SS4/(J-1)            MS4/MSE
Second source of inequality   K-1           SS5    MS5 = SS5/(K-1)            MS5/MSE
Experimental error            IK-J-I-K+2    SSE    MSE = SSE/(IK-J-I-K+2)     -
Total                         IK-1          SST    -                          -

Note: SS1+SS2=SS3+SS4; I=J; K=L.
Example 2.24

In research on the abrasion resistance of rubber, a Martindale tester is used for comparisons of various rubber samples. Five types of rubber have been tested in five cycles on four tester positions. The design of experiment for five rubber types and five test cycles is the Youden square shown in Table 2.85.
Table 2.85 Youden square

Cycle   Position on tester
        α   β   γ   δ
1       A   B   C   D
2       E   A   B   C
3       D   E   A   B
4       C   D   E   A
5       B   C   D   E

Design of experiment and test results are shown in a more suitable form for calculations in Table 2.86.

Table 2.86 Youden square

Rubber type   Test cycle-block                               Sum
treatment     1       2       3       4       5
A             68 α    33 β    54 γ    81 δ    -        236
B             49 β    31 γ    114 δ   -       40 α     234
C             80 γ    91 δ    -       56 α    50 β     277
D             71 δ    -       65 α    50 β    48 γ     234
E             -       51 α    49 β    70 γ    89 δ     259
Sum           268     206     282     257     227      1240

I = 5; J = 5; K = L = 4

SS_1 = \frac{268^2 + 206^2 + 282^2 + 257^2 + 227^2}{4} - \frac{1240^2}{5 \times 4} = \frac{311362.0}{4} - \frac{1537600}{20} = 960.50

SS_2 = \frac{5-1}{5 \times 4^2 (4-1)} \left[ (4 \times 236 - 268 - 206 - 282 - 257)^2 + \ldots + (4 \times 259 - 206 - 282 - 257 - 227)^2 \right] = 719.50

SS_3 = \frac{236^2 + 234^2 + 277^2 + 234^2 + 259^2}{4} - \frac{1240^2}{5 \times 4} = 374.50

SS_4 = \frac{5-1}{5 \times 4^2 (4-1)} \left[ (4 \times 268 - 236 - 234 - 277 - 234)^2 + \ldots + (4 \times 227 - 234 - 277 - 234 - 259)^2 \right] = 1305.5

SS_5 = \frac{(68+40+56+65+51)^2 + \ldots + (81+114+91+71+89)^2}{5} - \frac{1240^2}{20} = 5273.20

SS_T = (68^2 + 33^2 + 54^2 + 81^2 + \ldots + 51^2 + 49^2 + 70^2 + 89^2) - \frac{1240^2}{20} = 85358 - 76880 = 8478

SS_E = 8478 - 960.50 - 719.50 - 5273.20 = 1524.80
The complete analysis of variance is shown in Table 2.87.

It is evident that factor and blocks have no statistically significant effect. However, there is a statistically significant difference between sample positions on the tester. Position δ gives much higher rubber abrasion, which may mean that the rubber samples have not been mounted properly in this position.


Table 2.87 Analysis of variance of Youden square

Sources of variations                    f    SS        MS       F
Between blocks                           4    960.50    reject   -
Corrected factor                         4    719.50    179.9    0.94
Subtotal (blocks + corrected factor)     8    1680.00   -        -
Factor                                   4    374.50    reject   -
Corrected blocks                         4    1305.50   326.4    1.71
Subtotal (factor + corrected blocks)     8    1680.00   -        -
Between tester positions                 3    5273.20   1757.7   9.22
Experimental error                       8    1524.80   190.6    -
Total                                    19   8478.00   -        -
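The calculation of Example 2.24 can be reproduced programmatically. The sketch below follows the formulas used in the example (unadjusted and corrected sums of squares for factor and blocks, plus the second source of inequality); it assumes I = J and K = L, and the names `youden_anova`, `DATA` and the dictionary keys are hypothetical choices for this illustration.

```python
from collections import defaultdict

# Table 2.86: DATA[rubber type][test cycle] = (abrasion result, tester position).
DATA = {
    "A": {1: (68, "alpha"), 2: (33, "beta"), 3: (54, "gamma"), 4: (81, "delta")},
    "B": {1: (49, "beta"), 2: (31, "gamma"), 3: (114, "delta"), 5: (40, "alpha")},
    "C": {1: (80, "gamma"), 2: (91, "delta"), 4: (56, "alpha"), 5: (50, "beta")},
    "D": {1: (71, "delta"), 3: (65, "alpha"), 4: (50, "beta"), 5: (48, "gamma")},
    "E": {2: (51, "alpha"), 3: (49, "beta"), 4: (70, "gamma"), 5: (89, "delta")},
}


def youden_anova(data):
    """Sums of squares, error degrees of freedom and F ratios (I = J, K = L)."""
    treat_tot = defaultdict(float)
    block_tot = defaultdict(float)
    pos_tot = defaultdict(float)
    grand = 0.0
    sum_sq = 0.0
    for treatment, cells in data.items():
        for block, (y, position) in cells.items():
            treat_tot[treatment] += y
            block_tot[block] += y
            pos_tot[position] += y
            grand += y
            sum_sq += y * y
    i, j, k = len(treat_tot), len(block_tot), len(pos_tot)
    correction = grand ** 2 / (i * k)
    # Factor in front of the corrected sums; identical for blocks since I = J, K = L.
    scale = (i - 1) / (i * k ** 2 * (k - 1))

    ss = {}
    ss["blocks"] = sum(b * b for b in block_tot.values()) / k - correction
    ss["factor_corrected"] = scale * sum(
        (k * treat_tot[t] - sum(block_tot[b] for b in data[t])) ** 2 for t in data
    )
    ss["factor"] = sum(t * t for t in treat_tot.values()) / k - correction
    ss["blocks_corrected"] = scale * sum(
        (k * block_tot[b] - sum(treat_tot[t] for t in data if b in data[t])) ** 2
        for b in block_tot
    )
    # Each tester position is used once per block, that is, J times.
    ss["positions"] = sum(p * p for p in pos_tot.values()) / j - correction
    ss["total"] = sum_sq - correction
    ss["error"] = ss["total"] - ss["blocks"] - ss["factor_corrected"] - ss["positions"]

    df_error = i * k - j - i - k + 2
    ms_error = ss["error"] / df_error
    f_ratio = {
        "factor_corrected": ss["factor_corrected"] / (i - 1) / ms_error,
        "blocks_corrected": ss["blocks_corrected"] / (j - 1) / ms_error,
        "positions": ss["positions"] / (k - 1) / ms_error,
    }
    return ss, df_error, f_ratio


ss, df_error, f_ratio = youden_anova(DATA)
for name, value in ss.items():
    print(name, round(value, 2))
```

The identity SS1 + SS2 = SS3 + SS4 noted under Table 2.84 offers a convenient check of the arithmetic: both sides equal 1680.00 for these data.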

Example  2.25  [25] 

Raw gasoline and six additives to be added to it were used to test the octane number of gasoline. Blocks are orders of testing, and columns are testing times. The design of experiment corresponded to a 7 x 3 Youden square. Results of the experiment are shown in Table 2.88.

Table 2.88 Youden 7 x 3 square

Blocks   Columns                     Sum
         1       2       3
1        43 A    34 B    47 D       124
2        36 B    32 C    46 E       114
3        33 C    47 D    43 F       123
4        44 D    40 E    33 G       117
5        41 E    35 F    44 A       120
6        36 F    32 G    32 B       100
7        33 G    41 A    27 C       101
Sum      266     261     272        799
Table 2.89 Youden 7 x 3 square

Factor   Blocks                                                        Sum
         1        2        3        4        5        6        7
A        43(1)    -        -        -        44(3)    -        41(2)    128
B        34(2)    36(1)    -        -        -        32(3)    -        102
C        -        32(2)    33(1)    -        -        -        27(3)    92
D        47(3)    -        47(2)    44(1)    -        -        -        138
E        -        46(3)    -        40(2)    41(1)    -        -        127
F        -        -        43(3)    -        35(2)    36(1)    -        114
G        -        -        -        33(3)    -        32(2)    33(1)    98
Sum      124      114      123      117      120      100      101      799

For  the  sake  of  easier  calculation  Table  2.88  is  transformed  into  Table  2.89. 

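SS_1 = \frac{124^2 + 114^2 + 123^2 + 117^2 + 120^2 + 100^2 + 101^2}{3} - \frac{799^2}{7 \times 3} = 196.95

SS_2 = \frac{7-1}{7 \times 3^2 (3-1)} \left[ (3 \times 128 - 124 - 120 - 101)^2 + \ldots + (3 \times 98 - 117 - 100 - 101)^2 \right] = 493.62

SS_3 = \frac{128^2 + 102^2 + 92^2 + 138^2 + 127^2 + 114^2 + 98^2}{3} - \frac{799^2}{7 \times 3} = 608.29

SS_4 = \frac{7-1}{7 \times 3^2 (3-1)} \left[ (3 \times 124 - 128 - 102 - 138)^2 + \ldots + (3 \times 101 - 128 - 92 - 98)^2 \right] = 82.29

SS_5 = \frac{(43+36+33+44+41+36+33)^2 + \ldots + (44+32+27+47+46+43+33)^2}{7} - \frac{799^2}{7 \times 3} = \frac{266^2 + 261^2 + 272^2}{7} - 30400.05 = 8.67

SS_T = (43^2 + 44^2 + 41^2 + 34^2 + \ldots + 36^2 + 33^2 + 32^2 + 33^2) - \frac{799^2}{7 \times 3} = 706.95

SS_E = 706.95 - 196.95 - 493.62 - 8.67 = 7.71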

The complete analysis of variance is shown in Table 2.90.

The analysis of variance shows that the addition of additives to the basic gasoline has a statistically significant effect. The order of testing the gasoline, that is, the difference between blocks, is also statistically significant. The time factor is not statistically significant.
Table 2.90 Analysis of variance of Youden square

Sources of variations            f    SS       MS      F       F_{f;6;0.95}
Order-blocks                     6    196.95   -       -       -
Gasoline type-corrected factor   6    493.62   82.27   64.27   4.28
Gasoline type-factor             6    608.29   -       -       -
Order-corrected blocks           6    82.29    13.72   10.72   4.28
Time                             2    8.67     4.34    3.39    5.14
Experimental error               6    7.71     1.28    -       -
Total                            20   706.95   -       -       -

Problem  2.14  [25] 

Four samples of dyestuff were tested on a machine for abrasion testing. The machine has three positions, so that three samples can be tested simultaneously. Tests have been made in four cycles. The design of experiment was a 4 x 3 Youden square. Results are shown in the next table. Do the analysis of variance.


Table 2.91 Youden square

Positions   Cycles
            1       2       3       4
1           13 A    20 D    23 C    21 B
2           15 B    10 A    26 D    29 C
3           18 C    11 B    15 A    28 D

Problem  2.15 

Seven producers of a critical material state that their product satisfies very precise tensile strength requirements. Each producer has supplied enough material for four tests. Testing of this material has been designed under seven conditions. Tensile strength tests have been done on four testers (1, 2, 3 and 4). Testing has been done by a Youden square design. Process the outcomes of this testing by analysis of variance.
Conditions   Producers                                                                Sum
             1         2         3         4         5         6         7
1            1.62(4)   2.10(3)   1.50(2)   -         2.30(1)   -         -         7.52
2            -         1.93(4)   1.90(3)   1.77(2)   -         1.64(1)   -         7.24
3            -         -         2.22(4)   1.56(3)   2.29(2)   -         1.92(1)   7.99
4            2.14(1)   -         -         1.58(4)   1.88(3)   1.81(2)   -         7.41
5            -         1.65(1)   -         -         1.61(4)   1.65(3)   2.03(2)   6.94
6            1.99(2)   -         1.64(1)   -         -         1.74(4)   2.46(3)   7.83
7            1.75(3)   2.46(2)   -         1.86(1)   -         -         2.62(4)   8.69
Sum          7.50      8.14      7.26      6.77      8.08      6.84      9.03      53.62

Summary 

This chapter on design of experiments deals with screening factors by the significance of their effect on a measured value or response. By applying the described methodology, the researcher has:

• defined all factors and responses of the research subject;

• defined all variable parameters of the research subject as random values;

• clarified the random nature of the research subject variables and checked whether their distribution laws correspond to a normal distribution;

• estimated the correlation between variables of the research subject;

• ranked the factors, in accord with preliminary information, in order of their effects on the response by using a preliminary ranking method;

• screened the factors into significant and random ones as regards their effect on the research subject, by the random balance method.

Based on the results of a screening experiment and the objective of the research problem, the researcher decides which system factors and responses to include in the design of the basic experiment. This does not exhaust the information provided by a screening experiment:

• Information on factor-variation intervals may be drawn from an active experiment by the random balance method. Thus, the linear effects of factors X1 and X2 in Example 2.10 considerably exceed the effects of other factors. This may also mean that the selected factor-variation intervals X1(-;+) and X2(-;+) are too wide. If this is so, they should be cut in half in the basic experiment.

• Factor space may be obtained from the matrix of random balance and analysis of variance. Information on the number of replications of design points (trials) in the basic experiment is obtained from analysis of variance, and some evidence about linear or nonlinear relationships between variables of the research subject from correlation analysis.
Hence, a more complete conclusion on all that is obtained from screening experiments should include:

• factor space;

• factor-variation intervals;

• preliminary number of design points (trials) and replications;

• number of factors and their interactions, which should be included in the basic experiment.


2.3 

Basic  Experiment-Mathematical  Modeling 

The general form of the mathematical model for a research subject has been defined in Sect. 2.1.1:

y = \varphi(X_1, X_2, \ldots, X_k)

To select a mathematical model means to choose the form of the mathematical function and write down the associated equation. The next step is to design and perform the experiment, from the results of which the constants or coefficients of the chosen mathematical model are determined. The question is then how to choose the mathematical model. To obtain a satisfactory answer, let us consider the geometrical interpretation of the model, or response function. Geometrically, the response function is often called the response surface. When the response is a function of several factors, the possibility of geometrical interpretation is lost: we enter an abstract multidimensional space in which it is difficult to get oriented. In such cases we switch to the language of algebra, where computers help us find our way much more easily.

Let us stay with the geometrical interpretation of the response of a “black box” with two input factors. A simple coordinate system in the plane is sufficient for this. The variation levels of one factor are plotted on one axis and those of the other factor on the other axis. Each “black box” state then has a corresponding point in the plane. As has been said in Sect. 2.1.3, factors are defined by their domains. This means that each factor is defined by its minimal and maximal values, between which it may be changed continuously or discontinuously. If the factors are concordant, these limits form a rectangle in the plane within which lie the points that correspond to “black box” states. Dashed lines in Fig. 2.28 mark the limit values of the domain of factors and full lines the limits of the concordant domain of factors. To present the response values graphically we use the third axis of the coordinate system, so that the response surface has the shape given in Fig. 2.29.

The region over which the response surface has been constructed is called the factor space. The plane spanned by the factor axes is often regarded as the factor space. For a research subject defined by only two factors, the response function does not have to be interpreted geometrically in three-dimensional space. For such a presentation a plane is sufficient, onto which the intersection lines of the response surface with planes parallel to the X1 X2 plane are projected. This is shown in Fig. 2.30.


Figure  2.30  Contour  lines 


Point M in the figure is the optimum, which by one definition of the research problem objective is to be determined. Each intersection line in the plane is a line of constant response value and is called a contour line; together the lines form the contour diagram.

Since the way of interpreting a response function is now clear, the basic question is how to find the optimum at minimal expense. The possible approaches are as follows:

• When all possible response values are available for all combinations of factor-variation levels in the form of a table, there is no problem in defining or selecting the model, nor in finding the optimum. The optimum is the combination of factors that has the best response value. As there are endlessly many combinations of factors, this case is of no practical value.

• The second approach consists of choosing a random number of combinations of factor levels and measuring their responses, hoping that an optimal value, or at least a value close to the optimum, has been included. Evidently, such an approach involves great risk.

• The third approach involves mathematical modeling, that is, obtaining a mathematical model by which response values outside the region of the experiment may be extrapolated. Response values that are optimal or close to the optimum are then estimated.

The lack of all possible response values for all possible combinations of factor-variation levels is compensated by introducing assumptions about the unknown model before the start of an experiment. The main assumptions about the model or response surface refer to continuity, smoothness and existence of an optimum. Such assumptions allow the response function to be approximated by a polynomial in the vicinity of any point of the factor space. In the case of one optimum, the way we approach it is not essential, but if there are more, the problem becomes harder. Fig. 2.31 shows two response functions of one factor. The first one corresponds exactly to the introduced assumptions, and the second one is a case where the assumptions about smoothness and continuity are violated, since both an optimum and a peak exist. If in searching for an optimum of such a function we start gradually from one side, we shall find the smaller maximum without knowing that a bigger one exists too. If we know the response values at several neighboring points of the factor space, it is possible to estimate the values of the response at other neighboring points. We may successively find the points where the biggest increase (or decrease, if a minimum is sought) of the response value is expected. It is then clear that the next experiment should be moved to those very points, that is, we should move in that direction and neglect the others. After performing the new experiment and processing its outcomes, we may again determine the direction of fastest response change.
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Figure  2.31  Response  functions 


To put it more clearly, we select a point in the factor space and a cluster of points
around it, that is, we choose a subdomain of the factor space. The experiment is done
in that region, and the first model is defined on the basis of its results. This model
is used to estimate the response outside the experimental region. Such an estimation
of a response is called extrapolation; if the same is done for points within the
experimental region, it is called interpolation. Since the confidence of
extrapolation diminishes with distance from the experimental region, extrapolation
is done only in its vicinity. Extrapolation outcomes are used to choose the
conditions for the next experiment, and the procedure is then repeated until the
optimum is found. The model chosen for the first, so-called basic experiment has to
fulfill certain requirements.

The analyzed approach to finding the optimum indicates that the model must be able
to estimate the direction of further design points with the required precision.
Since we do not know the direction of movement towards the optimum before defining
the model, the accuracy of estimation of the mathematical model must be the same in
all directions. This is to say that the estimated values in the experimental region
may differ from the real, measured response values by no more than a magnitude
given in advance. A model that fulfills this requirement is called adequate;
adequacy is discussed in a separate section.

When several mathematical models fulfill the requirements, then, in principle, the
simpler model is accepted. For us, it is the polynomial model, which in the case of
two factors may have these forms:

• Polynomial of the zeroth order:

$$ y = b_0 \tag{2.54} $$

• Polynomial of the first order:

$$ y = b_0 + b_1X_1 + b_2X_2 \tag{2.55} $$

• Polynomial of the second order:

$$ y = b_0 + b_1X_1 + b_2X_2 + b_{12}X_1X_2 + b_{11}X_1^2 + b_{22}X_2^2 \tag{2.56} $$

• Polynomial of the third order:

$$ y = b_0 + b_1X_1 + b_2X_2 + b_{12}X_1X_2 + b_{11}X_1^2 + b_{22}X_2^2 + b_{112}X_1^2X_2 + b_{122}X_1X_2^2 + b_{111}X_1^3 + b_{222}X_2^3 \tag{2.57} $$

This means that we have presented the unknown response function by a polynomial of
the corresponding order. The operation of replacing one function with another is
called approximation. Since a polynomial may be of different orders, it is necessary
to select the polynomial order for the basic experiment, that is, for the first step
towards the optimum. The experiment should determine the numerical values of the
polynomial coefficients. The valid rule here is: the higher the polynomial order,
the more coefficients there are and the more design points the experiment requires.
As the objective of a designed experiment is a minimum of design points, a
first-order polynomial is selected for the basic experiment. The first-order
polynomial contains the information on the gradient direction, or the direction of
the fastest response change, and at the same time has the smallest number of
coefficients. The only questionable thing is whether the linear model will always be
adequate. It is, however, known from mathematics that each point has a neighborhood
where that model is adequate. What remains is to select the factor subdomain where
the linear model is adequate. The size of that subdomain is not known in advance,
but the lack of fit of the model may be checked from the outcomes of the experiment.
Hence, an arbitrary subdomain is selected in advance, to be corrected later to the
appropriate size after the lack of fit has been checked.
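The rule that a higher polynomial order means more coefficients can be checked by counting terms. The sketch below assumes a full polynomial in k factors, for which the number of terms up to a given order is the binomial coefficient C(k + order, order); the function name `n_coefficients` is only illustrative.

```python
from math import comb

def n_coefficients(k, order):
    """Number of terms in a full polynomial of the given order in k factors."""
    return comb(k + order, order)

# For two factors this reproduces the term counts of Eqs. (2.54)-(2.57).
for order in range(4):
    print(order, n_coefficients(2, order))
```

For two factors the counts are 1, 3, 6 and 10, which is why the first-order model needs the fewest design points among the models that still carry gradient information.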

In this way the basic experiment is defined for the linear model, and the gradient
that indicates the direction of the fastest response increase or decrease is
obtained. When a response maximum or minimum is sought, the experimental center is
moved in that direction and a new experiment for the linear model is performed. The
procedure is repeated as long as moving along the gradient has an effect. When it no
longer has an effect, we are close to the optimum. Polynomials of higher order,
mostly the second, are used in the optimum region.
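The iterative procedure just described can be sketched in a few lines. This is only an illustration under stated assumptions: the response function `true_response`, the starting center, the variation interval and the step length are all hypothetical, and a 2^2 full factorial design in coded units stands in for the basic experiment.

```python
# Sketch of moving to the optimum with repeated first-order models.
# All names and numbers are illustrative assumptions, not taken from the text.

def true_response(x1, x2):
    # Hypothetical smooth response with a single maximum at (3, 2).
    return 50.0 - (x1 - 3.0) ** 2 - 2.0 * (x2 - 2.0) ** 2

# Coded 2^2 full factorial design: every combination of -1 and +1.
DESIGN = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]

def fit_linear(center, interval):
    """Run the 2^2 design around `center` and return (b0, b1, b2)."""
    ys = [true_response(center[0] + c1 * interval, center[1] + c2 * interval)
          for c1, c2 in DESIGN]
    n = len(DESIGN)
    b0 = sum(ys) / n
    b1 = sum(c1 * y for (c1, _), y in zip(DESIGN, ys)) / n
    b2 = sum(c2 * y for (_, c2), y in zip(DESIGN, ys)) / n
    return b0, b1, b2

def move_to_optimum(center, interval=0.25, step=0.5, max_iter=50, tol=1e-3):
    """Fit a linear model, step along its gradient, stop when there is no gain."""
    center = list(center)
    best = true_response(*center)
    for _ in range(max_iter):
        _, b1, b2 = fit_linear(center, interval)
        norm = (b1 ** 2 + b2 ** 2) ** 0.5
        if norm < tol:
            break  # gradient has vanished: we are in the optimum region
        trial = [center[0] + step * b1 / norm, center[1] + step * b2 / norm]
        y = true_response(*trial)
        if y <= best:
            break  # moving along the gradient no longer has an effect
        center, best = trial, y
    return center, best
```

Calling `move_to_optimum((0.0, 0.0))` walks the experimental center towards the hypothetical maximum and stops within about one step length of it; a second-order model would then be fitted in that region.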

All this clearly shows that the selection of the subdomain for performing the basic
experiment is very important; it will be discussed in the coming sections.

Apart from optimization, the problem set is often one of mathematical modeling or
interpolation. In that case we are not interested in the optimum but in a model that
adequately describes the results obtained in the experimental field. A subdomain is
not chosen in that case; instead, the polynomial order is raised until an adequate
model is obtained. When a linear or incomplete second-order model (one with no
squared-factor terms) is adequate, the research objective corresponds to the
optimization objective.

Summary 

Previous considerations have given the model that will be systematically analyzed
when designing a basic experiment: the algebraic polynomial of the first order, or
linear model.

To select it, it has been necessary to analyze the geometric interpretation of the
response surface in the factor domain. The response surface is defined only in the
associated factor space. In this domain, one response value corresponds to each
combination of factors, or state of the research subject. A response surface, apart
from its representation in a Cartesian coordinate system, may also be presented in a
plane in the form of a response contour diagram. It has been mentioned that the
mathematical model in the first phase of experimenting serves to determine the
gradient, or direction of the fastest response change. A model in the form of an
analytic function with one optimum has been adopted. Since the function is analytic,
it may be approximated by a polynomial around any point in the factor space.

Accepting the given assumptions, finding an optimum, as the most complex problem, is
reduced to the following iterative procedure: set up an experiment with a minimum
of design points, determine from its outcomes the model from which the gradient is
obtained, and repeat the procedure in that direction until an optimum is reached.
The model must meet the requirements of adequacy and simplicity. Adequacy of the
model (absence of lack of fit) means that it estimates the experimental response
values with the necessary accuracy in the experimental region.

Lack of fit of the model is checked after performing the experiment. As a simple
model, a polynomial of the first order has been selected, which is linear in its
unknown coefficients; these are determined by processing experimental results. The
linear polynomial has been selected because there is always a small enough
experimental region where such a model is adequate. The choice of experimental
region depends on the researcher and is not completely formalized. When the problem
to be solved is one of modeling or interpolation rather than optimization, the
experimental region is fixed beforehand and the procedure is completed by obtaining
an adequate model in that region. When a linear model is inadequate, the polynomial
order is increased until the adequacy condition is fulfilled.

2.3.1 

Full  Factorial  Experiments  and  Fractional  Factorial  Experiments 

When designing the basic experiment, the unknown response function (2.5) is, in
principle, approximated by a polynomial of the corresponding degree (2.6), whose
regression coefficients are estimated on the basis of experimental results (2.7). A
linear mathematical model is considered in the first phase of research. Defining the
first-order regression model is the first phase of a study aimed at obtaining the
interpolation model, or function, knowledge of which facilitates estimating
response values at different points of the studied factor space. A linear model is
also used when moving to the optimum region, as when the steepest ascent method is
used as the optimization technique. Later, if necessary, the polynomial degree is
increased. Lack of fit of polynomial models is checked by methods of statistical
analysis.

In defining a linear model, coefficient $b_0$ and all linear regression coefficients
are calculated.

$$ y = b_0 + \sum_{i=1}^{k} b_i X_i + \sum_{i<j}^{k} b_{ij} X_i X_j \tag{2.58} $$

where:

$y$ is the response value;
$b_i$ are linear regression coefficients;
$b_{ij}$ are regression coefficients of double factor interactions.

Accuracy and confidence of the obtained estimates of regression coefficients depend
on the design of experiments used. Choosing the design of experiments means
determining the number of experimental points (trials) and distributing those
points in the factor space so that the necessary information is obtained with a
minimal number of trials. When the design of experiments is selected, a design
matrix, a table of standard form, is constructed in which the conditions of all
design points belonging to the chosen design are defined. Usually, rows of a design
matrix correspond to individual design points (trials) and columns to individual
factors. Obtaining a linear model involves performing a full factorial experiment
(FUFE) or a fractional factorial experiment (FRFE), which is a definite part of a
FUFE. A FUFE is an experiment in which all possible combinations of factor levels
are realized and the experimental results are processed by statistical analysis.
The number of FUFE design points is determined from relation (2.20), bearing in
mind that factors are varied on two levels in a FUFE. A FUFE is therefore called a
design of type $2^k$. When the number of factors $k$ is large, a FUFE requires a
large number of trials ($N=2^k$), so that an FRFE is used more frequently in that
case. The FRFE design matrix is called a fractional replica. Coded factor values
are used when composing FUFE and FRFE matrices. Coding factors is a linear
transformation of the factor space coordinates that moves the coordinate origin to
the null point, or experimental center, and scales the coordinate axes in units of
the factor-variation interval. The transformation is given by this expression:

$$ X_i = \frac{x_i - x_{i0}}{\Delta x_i} \tag{2.59} $$

where:

$X_i$ is the coded value of the i-th factor (a nondimensional magnitude);
$x_i$, $x_{i0}$ are the natural (real) values of the factor, its current and null
values, respectively;
$\Delta x_i$ is the natural value of the factor-variation interval.

In a design matrix where factors are varied on two levels (+1; -1), only the signs
(+; -) are written.
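The coding transformation of Eq. (2.59) and its inverse can be written as two one-line functions. The sketch below is illustrative only: the function names and the temperature example (center 80, interval 10) are assumptions and do not come from the text.

```python
def code_factor(x, x0, dx):
    """Coded value from natural value x, null (center) value x0 and interval dx."""
    return (x - x0) / dx

def decode_factor(coded, x0, dx):
    """Natural value that corresponds to a coded value (inverse transformation)."""
    return x0 + coded * dx

# Hypothetical factor: temperature with center 80 and variation interval 10.
lower = code_factor(70.0, 80.0, 10.0)   # lower level of the factor
upper = code_factor(90.0, 80.0, 10.0)   # upper level of the factor
```

With these assumed values the lower and upper natural levels map to the coded levels -1 and +1, and `decode_factor` is what turns a design matrix into real experimental settings.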

With FRFE we differentiate between regular and irregular replicas. A regular
fractional replica is obtained from a FUFE matrix by dividing it into a number of
parts that is a power of two (2, 4, 8, 16, etc.). The obtained replicas are
respectively marked as 1/2-replica or half-replica, 1/4-replica, 1/8-replica, etc.
Irregular replicas are obtained by taking, for example, 3/4 or 5/8 of the FUFE
matrix.

Let us now consider how to construct FUFE and FRFE matrices. The first column of the
FUFE design matrix gives the trial number. The next column is a fictitious variable
($X_0=+1$) that is used to estimate the free member $b_0$ of the regression
equation. It is followed by a number of columns that corresponds to the number of
factors. Sometimes columns that correspond to factor interactions are also
included. The number of rows equals the number of design points (trials) and is
determined as $N=2^k$. An example of FUFE design matrix construction is shown in
Table 2.93. Note that the column of the fictitious factor is often dropped. For any
number of factors $k$, the FUFE design matrix is constructed so that the matrix for
$(k-1)$ factors is repeated twice: first for the lower level of the k-th factor, and
then for its upper level. The distribution of design points (trials) in the factor
space for FUFE with k=2 and k=3 is shown in Figs. 2.32 and 2.33.
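The construction rule (repeat the matrix for k-1 factors twice, first with the k-th factor at its lower level and then at its upper level) translates directly into a short recursion. This is a minimal sketch; the function name `full_factorial` is an assumption, and the fictitious column X0 is omitted.

```python
def full_factorial(k):
    """Coded 2^k design matrix as a list of rows; the first factor varies fastest."""
    if k == 0:
        return [[]]
    smaller = full_factorial(k - 1)
    # Repeat the (k-1)-factor matrix twice: lower level of X_k, then upper level.
    return [row + [-1] for row in smaller] + [row + [+1] for row in smaller]

design = full_factorial(5)   # 32 trials in the row order of Table 2.93
```

Because each new factor is appended as the last column, the first factor alternates on every row and the last factor changes only once, which is the standard order used in Table 2.93.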


Table 2.93 FUFE design

No. of trial   X0   X1   X2   X3   X4   X5
 1             +    -    -    -    -    -
 2             +    +    -    -    -    -
 3             +    -    +    -    -    -
 4             +    +    +    -    -    -
 5             +    -    -    +    -    -
 6             +    +    -    +    -    -
 7             +    -    +    +    -    -
 8             +    +    +    +    -    -
 9             +    -    -    -    +    -
10             +    +    -    -    +    -
11             +    -    +    -    +    -
12             +    +    +    -    +    -
13             +    -    -    +    +    -
14             +    +    -    +    +    -
15             +    -    +    +    +    -
16             +    +    +    +    +    -
17             +    -    -    -    -    +
18             +    +    -    -    -    +
19             +    -    +    -    -    +
20             +    +    +    -    -    +
21             +    -    -    +    -    +
22             +    +    -    +    -    +
23             +    -    +    +    -    +
24             +    +    +    +    -    +
25             +    -    -    -    +    +
26             +    +    -    -    +    +
27             +    -    +    -    +    +
28             +    +    +    -    +    +
29             +    -    -    +    +    +
30             +    +    -    +    +    +
31             +    -    +    +    +    +
32             +    +    +    +    +    +

As the figures show, the points of the design $2^2$ are given by the coordinates of
the vertices of a square, and the design points of the design $2^3$ by the
coordinates of the vertices of a cube. Design points for k>3 are distributed in an
analogous way. Regression coefficients of Eq. (2.58) are determined from FUFE
outcomes. For k=3, these coefficients are determined:

$$ y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_{12}X_1X_2 + b_{13}X_1X_3 + b_{23}X_2X_3 + b_{123}X_1X_2X_3 \tag{2.60} $$

where:

$b_{123}$ is the regression coefficient of the triple interaction;
$b_{12}$, $b_{13}$, $b_{23}$ are regression coefficients of double interactions;
$b_1$, $b_2$, $b_3$ are linear regression coefficients;
$b_0$ is the free member.

Figure 2.32 Distribution of points in factor space (k=2; N=2^2)

Figure 2.33 Distribution of points in factor space (k=3; N=2^3)

It has already been said that the magnitude of a linear regression coefficient
indicates the strength of influence of the associated factor on the response. The
higher the absolute value of $b_i$, the more strongly the associated factor affects
the response. The sign of the coefficient has to be taken into account too: if $b_i$
is positive, an increase of the associated factor causes an increase in the
response; with a negative linear regression coefficient, an increase in the factor
value causes a decrease in the optimization parameter. It is sometimes more
interesting to observe the effect of the i-th factor, which is equal to $2b_i$, and
in that way estimate the change in response as the factor changes from its lower
(-1) to its upper (+1) level. This is especially applied in designs that include
qualitative factors. The total number of all possible effects that may be
calculated from a FUFE, counting the free member $b_0$, corresponds to the number
of design points N. The number of linear regression coefficients is identical to
the number of factors k included in the FUFE matrix. To determine the number of
factor interactions of a given order we use the formula:

$$ C_k^m = \frac{k(k-1)(k-2)\cdots(k-m+1)}{1\times 2\times 3\times\cdots\times m} \tag{2.61} $$

where:

k is the number of factors;
m is the number of factors in an interaction, or the interaction order.
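The interaction-count formula is the ordinary binomial coefficient, so it can be checked against the statement that a FUFE yields as many coefficients as it has design points. The sketch below uses Python's `math.comb`; the helper name `effect_counts` is an assumption.

```python
from math import comb

def effect_counts(k):
    """Number of effects of each order m = 1..k for k two-level factors."""
    return {m: comb(k, m) for m in range(1, k + 1)}

counts = effect_counts(5)
# Linear effects (m = 1), all interactions (m >= 2) and the free member b0:
total = sum(counts.values()) + 1
```

For k=5 this gives 5 linear effects, 10 double, 10 triple, 5 quadruple and 1 quintuple interaction, which together with the free member make up the 32 trials of the design.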

When analyzing results of factorial experiments we talk about main effects and
interaction effects. Main effects are factor effects: the main effect of a factor is
the difference between the average responses at its two levels (+1; -1). If the
response difference between the two levels of factor $X_1$ is the same irrespective
of the level of factor $X_2$ (apart from experimental error), one may say that there
is no interaction between factors $X_1$ and $X_2$, or that the interaction
$X_1X_2=0$. This statement may be presented graphically. Figures 2.34 and 2.35 show
interaction between factors $X_1$ and $X_2$, and Fig. 2.36 indicates that such an
interaction is nonexistent.
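The definitions of main and interaction effects can be tried on a small 2^2 example. The response values below are invented purely for illustration; the interaction effect is computed the same way as a main effect, using the sign of the product column X1X2.

```python
from statistics import mean

# Hypothetical 2^2 results: (X1, X2, response). The numbers are made up.
RUNS = [(-1, -1, 20.0), (+1, -1, 30.0), (-1, +1, 25.0), (+1, +1, 45.0)]

def effect(runs, sign):
    """Average response where sign(...) is +1 minus average where it is -1."""
    hi = mean(y for x1, x2, y in runs if sign(x1, x2) > 0)
    lo = mean(y for x1, x2, y in runs if sign(x1, x2) < 0)
    return hi - lo

effect_x1 = effect(RUNS, lambda x1, x2: x1)         # main effect of X1
effect_x2 = effect(RUNS, lambda x1, x2: x2)         # main effect of X2
effect_x1x2 = effect(RUNS, lambda x1, x2: x1 * x2)  # interaction effect X1X2
```

In this made-up data the response rises by 10 with X1 when X2 is at its lower level and by 20 when X2 is at its upper level, so the interaction effect is nonzero; equal rises at both levels of X2 would give an interaction effect of zero.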


Figure 2.34 Response values indicate existence of interaction X1X2

Figure 2.35 Response values indicate existence of interaction X1X2

Figure 2.36 Response values indicate nonexistence of interaction X1X2

It is frequently the case in practice that certain interactions are statistically
insignificant, and those are mostly interactions of a higher order (triple,
quadruple, etc.).
This property of higher-order interactions is precisely what is used for
constructing fractional replicas. A fractional design is obtained by replacing
statistically insignificant interactions in a FUFE design column by a new factor. It
should be pointed out that when forming a fractional replica, FUFE rows are not
divided mechanically: a FUFE design may not simply be cut into two parts to form a
half-replica. Fractional replicas or designs must be formed in accord with the rule
of design saturation. A saturated design is the replica obtained when all
interaction effects are replaced by linear effects of new factors, so that the
number of degrees of freedom is f=0. The experiment in this case involves a minimal
number of design points, and its outcomes may lead to erroneous conclusions if the
linear model is inadequate and interaction effects significantly affect the
linear-effect estimates. A check of lack of fit of linear models obtained from
saturated designs is not feasible, as the number of degrees of freedom is f=0.

Taking all this into consideration, unsaturated designs (f>0), or special designs
that allow for the influence of interaction effects on linear-effect estimates, are
used in practice. An oversaturated design (f<0) was used in Example 2.12 as a random
balance method design, but a totally different problem was being solved in that case.
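The distinction between saturated, unsaturated and oversaturated designs can be expressed as a small check. The sketch assumes that the degrees of freedom for a linear model are f = N - (k + 1), that is, the trials left over after estimating the free member and k linear effects; this formula is a common convention stated here as an assumption, not a quotation from the text.

```python
def degrees_of_freedom(n_trials, n_factors):
    """Assumed convention: f = N - (k + 1) for a linear model in k factors."""
    return n_trials - (n_factors + 1)

def classify(n_trials, n_factors):
    """Label a two-level design by the sign of its degrees of freedom."""
    f = degrees_of_freedom(n_trials, n_factors)
    if f > 0:
        return "unsaturated"
    if f == 0:
        return "saturated"
    return "oversaturated"
```

Under this assumption a half-replica with 4 trials and 3 factors, or a 16-trial design with 15 factors, is saturated, whereas a full 2^3 experiment with 8 trials leaves 4 degrees of freedom for checking lack of fit.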

To compare the efficiency of fractional replicas, a special criterion called the
power of solving a replica is used. This criterion counts the number of linear
effects that are not aliased/confounded in the given design. When effects are
aliased/confounded, processing the experimental outcomes gives aliased/confounded
regression coefficients, which simultaneously characterize both linear and
interaction effects. By mixing the effects in this way, the number of design points
(trials) is reduced, but at the same time the analysis of experimental outcomes is
complicated. From the same FUFE design one may get replicas with different degrees
of effect mixing, that is, with different powers of solving. A researcher should, in
principle, try to find the replica with the highest possible power of solving. This
is most often achieved when new linear effects are aliased/confounded with
highest-order interactions. This is exactly the principle used in Example 2.12 to
form the oversaturated design of the random balance method from fractional
replicas: the triple interaction was replaced there by a new factor,
$X_1X_2X_3=X_4$. Higher-order interactions are replaced by new linear effects
because we may assume, with a high level of confidence, that those interactions are
less significant than lower-order interactions. At the same time, the probability
of obtaining nonaliased linear effects increases. Estimating the power of solving a
replica is itself complicated in experiments with several factors. To alleviate
the problem two new terms are introduced: generating ratio and defining contrast.

The generating ratio indicates the effect with which a new effect is
aliased/confounded. For example, when the triple interaction $X_1X_2X_3$ is replaced
by factor $X_4$, the generating ratio has the form $X_1X_2X_3=X_4$. Increasing the
number of symbols in a generating ratio increases the power of solving a replica:
it is higher for $X_1X_2X_3=X_4$ than for $X_1X_2=X_4$.

The defining contrast is obtained by multiplying both sides of the generating ratio
by the new factor. For the generating ratio $X_1X_2X_3=X_4$, the defining contrast
has the form $1=X_1X_2X_3X_4$.

The defining contrast helps to determine aliased/confounded effects. For this, it is
necessary to multiply successively both sides of the defining contrast by factors
from the matrix columns. For factor $X_4$ we obtain in this case:

$$ X_4 = X_1X_2X_3X_4^2 = X_1X_2X_3 $$

This system of mixed effects may conveniently be written in terms of regression
coefficients. For factor $X_4$ in this case we obtain:

$$ b_4 = \beta_4 + \beta_{123} $$

For other factors and interactions we get:

$$ b_1 = \beta_1 + \beta_{234}; \quad b_2 = \beta_2 + \beta_{134}; \quad b_3 = \beta_3 + \beta_{124} $$

$$ b_{12} = \beta_{12} + \beta_{34}; \quad b_{13} = \beta_{13} + \beta_{24}; \quad b_{23} = \beta_{23} + \beta_{14} $$

Fractional replicas may also be regarded as designs of the type $2^{k-p}$, where p is
the number of linear effects aliased/confounded with interaction effects. As has
been said, the fractional replica $2^{4-1}$ was used in Example 2.12 to construct
the random balance design. A design of type $2^{3-1}$ admits two half-replicas,
defined by the two generating ratios $X_1X_2=X_3$ and $-X_1X_2=X_3$. For the first
half-replica, factor $X_3$ takes in the design matrix the column that corresponds to
$X_1X_2$, Table 2.94.

Table 2.94 Fractional factorial design 2^{3-1}

No. of trial   X0   X1   X2   X3=X1X2
1              +    +    +    +
2              +    -    +    -
3              +    +    -    -
4              +    -    -    +


Signs in the interaction column are obtained by simply multiplying the columns of
the associated factors.

By multiplying the chosen generating ratio by the new factor $X_3$ we obtain the
defining contrast: $1=X_1X_2X_3$.

The defining contrast is then multiplied by each factor of the design $2^{3-1}$. If
a product contains the square of a factor, the square is replaced by the number one.
Aliased/confounded effects for the observed half-replica are given by these ratios:

$$ X_1 = X_2X_3; \quad X_2 = X_1X_3; \quad X_3 = X_1X_2 $$

This means that regression coefficients will be estimated as these
aliased/confounded effects:

$$ b_1 = \beta_1 + \beta_{23}; \quad b_2 = \beta_2 + \beta_{13}; \quad b_3 = \beta_3 + \beta_{12} $$

To illustrate this let us observe the FUFE $2^5$. In this case we obtain a
half-replica of type $2^{5-1}$, as given by the generating ratio
$X_5=X_1X_2X_3X_4$. The associated defining contrast is $1=X_1X_2X_3X_4X_5$, and
aliased/confounded estimates are defined by these ratios:

$$ X_1 = X_2X_3X_4X_5; \quad X_1X_2 = X_3X_4X_5; \quad X_2X_4 = X_1X_3X_5; $$
$$ X_2 = X_1X_3X_4X_5; \quad X_1X_3 = X_2X_4X_5; \quad X_2X_5 = X_1X_3X_4; $$
$$ X_3 = X_1X_2X_4X_5; \quad X_1X_4 = X_2X_3X_5; \quad X_3X_4 = X_1X_2X_5; $$
$$ X_4 = X_1X_2X_3X_5; \quad X_1X_5 = X_2X_3X_4; \quad X_3X_5 = X_1X_2X_4; $$
$$ X_5 = X_1X_2X_3X_4; \quad X_2X_3 = X_1X_4X_5; \quad X_4X_5 = X_1X_2X_3 $$
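The rule for finding aliased effects (multiply by the defining contrast and replace every squared factor by one) is the same as taking the symmetric difference of two sets of factor indices. The sketch below represents each effect as a set of indices; the function names are illustrative assumptions.

```python
def multiply(word_a, word_b):
    """Product of two effects; factors appearing twice cancel (X_i squared is 1)."""
    return frozenset(word_a) ^ frozenset(word_b)

def alias(effect, defining_word):
    """Effect aliased with `effect` under one defining contrast, as sorted indices."""
    return sorted(multiply(effect, defining_word))

# Half-replica 2^(5-1) with defining contrast 1 = X1 X2 X3 X4 X5.
WORD = {1, 2, 3, 4, 5}
alias_of_x1 = alias({1}, WORD)       # indices of the effect aliased with X1
alias_of_x1x2 = alias({1, 2}, WORD)  # indices of the effect aliased with X1X2
```

Running the same function with the defining contrast {1, 2, 3} reproduces the relations of the half-replica 2^(3-1), for example X1 aliased with X2X3.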


Now consider the FUFE 1/16-replica for eight factors. In this case the design of
type $2^{8-4}$ is defined by four generating ratios:

$$ X_5 = X_1X_2X_3X_4; \quad X_6 = X_1X_2X_3; $$
$$ X_7 = X_1X_2X_4; \quad X_8 = X_1X_3X_4; $$

The design matrix is shown in Table 2.95.

Table 2.95 Fractional factorial design 2^{8-4}

No. of   Design matrix                                    Response
trial    X0   X1   X2   X3   X4   X5   X6   X7   X8       y_u
 1       +    +    +    +    +    +    +    +    +        18.1
 2       +    +    +    -    +    -    -    -    +        32.1
 3       +    +    -    +    +    -    -    +    -        19.5
 4       +    +    -    -    +    +    +    -    -        20.5
 5       +    -    +    +    +    -    -    +    +        15.0
 6       +    -    +    -    +    +    +    -    +        25.8
 7       +    -    -    +    +    +    +    +    -        24.0
 8       +    -    -    -    +    -    -    -    -        43.8
 9       +    +    +    +    -    -    +    -    -        25.0
10       +    -    -    -    -    +    -    +    +        52.0
11       +    -    +    +    -    +    -    -    -        42.6
12       +    +    -    -    -    -    +    +    +        29.2
13       +    +    -    +    -    +    -    -    +        25.0
14       +    -    +    -    -    -    +    +    -        55.0
15       +    +    +    -    -    +    -    +    -        45.0
16       +    -    -    +    -    -    +    -    +        34.0

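Defining contrasts are:

$$ 1 = X_1X_2X_3X_4X_5; \quad 1 = X_1X_2X_3X_6; $$
$$ 1 = X_1X_2X_4X_7; \quad 1 = X_1X_3X_4X_8; $$

By multiplying the defining contrasts together (two, three or four at a time) we
obtain the general defining contrast:

$$ 1 = X_1X_2X_3X_4X_5 = X_1X_2X_3X_6 = X_1X_2X_4X_7 = X_1X_3X_4X_8 = X_4X_5X_6 = X_3X_5X_7 = X_2X_5X_8 = $$
$$ = X_3X_4X_6X_7 = X_2X_4X_6X_8 = X_2X_3X_7X_8 = X_1X_2X_5X_6X_7 = X_1X_3X_5X_6X_8 = X_1X_6X_7X_8 = $$
$$ = X_1X_4X_5X_7X_8 = X_2X_3X_4X_5X_6X_7X_8. $$

By neglecting all higher-order interactions starting with triple ones, we obtain
these estimates of regression coefficients:

$$ b_1 = \beta_1; \quad b_2 = \beta_2 + \beta_{58}; \quad b_3 = \beta_3 + \beta_{57}; $$
$$ b_4 = \beta_4 + \beta_{56}; \quad b_5 = \beta_5 + \beta_{46} + \beta_{37} + \beta_{28}; \quad b_6 = \beta_6 + \beta_{45}; $$
$$ b_7 = \beta_7 + \beta_{35}; \quad b_8 = \beta_8 + \beta_{25}; \quad b_{12} = \beta_{12} + \beta_{36} + \beta_{47}; $$
$$ b_{13} = \beta_{13} + \beta_{26} + \beta_{48}; \quad b_{14} = \beta_{14} + \beta_{27} + \beta_{38}; \quad b_{23} = \beta_{23} + \beta_{16} + \beta_{78}; $$
$$ b_{24} = \beta_{24} + \beta_{17} + \beta_{68}; \quad b_{34} = \beta_{34} + \beta_{18} + \beta_{67}; \quad b_{15} = \beta_{15}. $$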
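The general defining contrast and the resulting alias estimates can be generated mechanically from the four defining contrasts. The sketch below again treats each word as a set of factor indices and multiplies words by symmetric difference; the function names and the `max_order` cut-off (which drops triple and higher interactions) are assumptions made for illustration.

```python
from itertools import combinations

# The four defining contrasts of the 2^(8-4) design, as sets of factor indices.
CONTRASTS = [{1, 2, 3, 4, 5}, {1, 2, 3, 6}, {1, 2, 4, 7}, {1, 3, 4, 8}]

def defining_relation(words):
    """All products of one, two, three or four defining contrasts."""
    relation = []
    for r in range(1, len(words) + 1):
        for combo in combinations(words, r):
            product = set()
            for word in combo:
                product ^= word  # squared factors cancel
            relation.append(frozenset(product))
    return relation

def aliases(effect, relation, max_order=2):
    """Effects of order <= max_order that are aliased with `effect`."""
    effect = frozenset(effect)
    found = [sorted(effect ^ word) for word in relation]
    return sorted(a for a in found if len(a) <= max_order)

RELATION = defining_relation(CONTRASTS)   # 15 words of the general contrast
```

With these inputs `aliases({5}, RELATION)` lists the double interactions X2X8, X3X7 and X4X6, in agreement with the estimate for b5, and `aliases({1}, RELATION)` is empty, in agreement with b1.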

Depending on the significance of interactions, one experimental design may be used
in different cases. Thus, a design of experiments with eight design points (trials)
may be used [23]:

• in the case of three factors, to calculate all main effects and all interactions
(FUFE);

• in the case of four factors, to calculate main effects and the two-factor
interactions among three of the four factors, neglecting all other interactions;

• in the case of five factors, to calculate main effects and two two-factor
interactions;

• in the case of six factors, to calculate main effects and one two-factor
interaction;

• in the case of seven factors, to calculate main effects, neglecting all
interactions.

Now consider a fractional replica of type $2^{15-11}$, which is the 1/2048-replica
of a FUFE. It is pointless in this case to write down all aliased/confounded
estimates, as their number is enormous. As an example, linear effects are
aliased/confounded with 105 two-factor interactions. The design matrix of
$2^{15-11}$ is shown in Table 2.96.

Once the design matrix of an experiment has been formed, it is transformed into an
operational matrix by replacing coded values with the associated real, dimensional
values. The experiment is done on the basis of the operational matrix. When the
experimental values, or responses, have been obtained, we return to the design
matrix, which is then extended into an arithmetic matrix by adding the columns
associated with the interactions of interest. Regression coefficients are then
calculated by the method of least squares, as a special method of regression
analysis. Linear regression coefficients are calculated by these formulas:

$$ b_i = \frac{\sum_{u=1}^{N} X_{iu}\,\bar{y}_u}{\sum_{u=1}^{N} X_{iu}^2} = \frac{\sum_{u=1}^{N} X_{iu}\,\bar{y}_u}{N} \tag{2.62} $$

where:

$X_{iu}$ is the coded value of factor $X_i$ in the u-th design point (trial);
$\bar{y}_u$ is the response average in the u-th design point (trial);
N is the total number of design points (trials) in the design matrix;
u is the current number of the design point (trial).

Table 2.96 Fractional factorial design 2^{15-11}

No. of trial  X1  X2  X3  X4  X5  X6  X7  X8  X9  X10 X11 X12 X13 X14 X15
 1            -   -   -   -   +   +   +   +   +   +   -   -   -   -   +
 2            +   -   -   -   -   -   -   +   +   +   +   +   +   -   -
 3            -   +   -   -   -   +   +   -   -   +   +   +   -   +   -
 4            +   +   -   -   +   -   -   -   -   +   -   -   +   +   +
 5            -   -   +   -   +   -   +   -   +   -   +   -   +   +   -
 6            +   -   +   -   -   +   -   -   +   -   -   +   -   +   +
 7            -   +   +   -   -   -   +   +   -   -   -   +   +   -   +
 8            +   +   +   -   +   +   -   +   -   -   +   -   -   -   -
 9            -   -   -   +   +   +   -   +   -   -   -   +   +   +   -
10            -   +   -   +   -   +   -   -   +   -   +   -   +   -   +
11            +   +   -   +   +   -   +   -   +   -   -   +   -   -   -
12            -   -   +   +   +   -   -   -   -   +   +   +   -   -   +
13            +   -   +   +   -   +   +   -   -   +   -   -   +   -   -
14            -   +   +   +   -   -   -   +   +   +   -   -   -   +   -
15            +   +   +   +   +   +   +   +   +   +   +   +   +   +   +
16            +   -   -   +   -   -   +   +   -   -   +   -   -   +   +

The value of the free term in the regression equation ($b_0$) is determined from
the relation:

$$ b_0 = \frac{\sum_{u=1}^{N} \bar{y}_u}{N} \qquad (2.63) $$


Regression coefficients of two-factor interactions are determined thus:

$$ b_{ij} = \frac{\sum_{u=1}^{N} X_{iu} X_{ju}\,\bar{y}_u}{\sum_{u=1}^{N} (X_{iu} X_{ju})^{2}} = \frac{\sum_{u=1}^{N} X_{iu} X_{ju}\,\bar{y}_u}{N} \qquad (2.64) $$


After  obtaining  regression  coefficient  values,  both  their  statistical  significance  and 
lack  of  fit  of  the  obtained  regression  model  are  checked. 
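
Relations (2.62)-(2.64) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `regression_coefficients` and the 2^2 design with its four responses are hypothetical and were chosen so that the arithmetic is easy to follow by hand.

```python
from itertools import combinations

def regression_coefficients(design, responses):
    """Least-squares coefficients for a coded two-level design.

    design    -- list of rows, each a list of coded factor levels (-1 or +1)
    responses -- list of mean responses, one per design point-trial
    """
    n_trials = len(design)
    n_factors = len(design[0])
    # Free term, Eq. (2.63)
    coeffs = {"b0": sum(responses) / n_trials}
    # Linear coefficients, Eq. (2.62); the sum of squared coded levels equals N
    for i in range(n_factors):
        total = sum(row[i] * y for row, y in zip(design, responses))
        coeffs[f"b{i + 1}"] = total / n_trials
    # Two-factor interaction coefficients, Eq. (2.64)
    for i, j in combinations(range(n_factors), 2):
        total = sum(row[i] * row[j] * y for row, y in zip(design, responses))
        coeffs[f"b{i + 1}{j + 1}"] = total / n_trials
    return coeffs

# Hypothetical 2^2 design and responses, used only to illustrate the arithmetic
design = [[-1, -1], [+1, -1], [-1, +1], [+1, +1]]
responses = [10.0, 14.0, 12.0, 20.0]
coeffs = regression_coefficients(design, responses)
print(coeffs)
```

Because every coded column contains only -1 and +1, each coefficient reduces to a signed sum of the responses divided by N, which is why no matrix inversion is needed here.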


2.3.1.1 Yates Method

Section 1.5 dealt with the analysis of variance method, which may easily be applied
to the analysis of FUFE results. In doing so, one should take care to transform the
FUFE design arithmetic matrix into the table required by the analysis of variance
notation. One should also keep in mind the difference in processing designs with
and without replications of design points-trials. The classical procedure of
processing analysis of variance outcomes becomes much more complex as the number
of factors increases. To facilitate routine processing of FUFE designs with a
large number of factors, Yates [27, 28] developed a simple procedure for presenting
and processing outcomes. To simplify matters these symbols are introduced: factors
are labelled by Latin capitals, the design point-trial with all factors at the lower
level by (1), and the other design points-trials by the small letters of the factors
that are at the upper level.

As an example, the symbols for the experimental conditions of a $2^3$ FUFE are shown
in Table 2.97.

Table 2.97 FUFE $2^3$ symbols

Trial      Factor symbols
symbols    A    B    C
(1)        -    -    -
a          +    -    -
b          -    +    -
ab         +    +    -
c          -    -    +
ac         +    -    +
bc         -    +    +
abc        +    +    +

In accordance with the former notation, the symbols in the columns denote the
variation levels of the associated factors A, B and C.

A small letter in a design point-trial symbol means that the factor it represents is
set at the upper level in that design point-trial, while the absence of a small letter
means that the associated factor is at the lower level. The symbol (1) means that all
factors are at the lower level; a design point-trial marked by a indicates that factor
A is at the upper and factors B and C at the lower level, etc.

In accord with calculations by the least squares method, the effect of factor A is
the difference between the average responses of the design points-trials at the upper
and at the lower level:

$$ A = \tfrac{1}{4}(a + ab + ac + abc) - \tfrac{1}{4}\big((1) + b + c + bc\big) \qquad (2.65) $$

By treating (1), a, b and c as algebraic symbols we get:

$$ A = \tfrac{1}{4}(a-1)(b+1)(c+1) \qquad (2.66) $$

By analogy we obtain:

$$ \begin{aligned}
A   &= \tfrac{1}{4}(a-1)(b+1)(c+1) \\
B   &= \tfrac{1}{4}(a+1)(b-1)(c+1) \\
C   &= \tfrac{1}{4}(a+1)(b+1)(c-1) \\
AB  &= \tfrac{1}{4}(a-1)(b-1)(c+1) \\
AC  &= \tfrac{1}{4}(a-1)(b+1)(c-1) \\
BC  &= \tfrac{1}{4}(a+1)(b-1)(c-1) \\
ABC &= \tfrac{1}{4}(a-1)(b-1)(c-1)
\end{aligned} \qquad (2.67) $$

The Yates algorithm is easily generalized for any $2^k$ FUFE. For k factors A, B, C,
..., Q, each on two levels, the associated expressions are:

$$ \begin{aligned}
A           &= (1/2)^{k-1}(a-1)(b+1)(c+1)\cdots(q+1) \\
AB          &= (1/2)^{k-1}(a-1)(b-1)(c+1)\cdots(q+1) \\
ABC\cdots Q &= (1/2)^{k-1}(a-1)(b-1)(c-1)\cdots(q-1)
\end{aligned} \qquad (2.68) $$

The equation for calculating effect A in its expanded form may be written as:

$$ 4A = -(1) + a - b + ab - c + ac - bc + abc \qquad (2.69) $$

The terms of this equation follow the same sequence as the design points-trials,
so that an effect may be determined directly from the responses of the design
points-trials by attaching the associated signs:

- + - + - + - +
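
The sign sequence of any effect follows from the products in (2.67) and (2.68): a factor that appears in the effect contributes +1 when its small letter is present in the trial symbol and -1 otherwise, and a factor that does not appear in the effect contributes +1 for every trial. The sketch below illustrates this rule; the function names `standard_order` and `effect_signs` are hypothetical helpers introduced here, not part of the original text.

```python
from itertools import product

def standard_order(k):
    """Trial symbols of a 2^k design in Yates order: (1), a, b, ab, c, ..."""
    letters = "abcdefghijklmnopqrstuvwxyz"[:k]
    trials = []
    for levels in product((0, 1), repeat=k):
        # product() varies the last position fastest; reversing the tuple
        # makes factor 'a' the fastest-changing one, as in Table 2.97
        levels = levels[::-1]
        symbol = "".join(letter for letter, on in zip(letters, levels) if on)
        trials.append(symbol or "(1)")
    return trials

def effect_signs(effect, k):
    """Signs (+1 or -1) attached to each trial when computing the named effect."""
    signs = []
    for trial in standard_order(k):
        sign = 1
        for factor in effect.lower():
            # (letter - 1) term: +1 if the factor is at the upper level, else -1
            sign *= 1 if factor in trial else -1
        signs.append(sign)
    return signs

print(standard_order(3))
print("A  ", effect_signs("A", 3))
print("ABC", effect_signs("ABC", 3))
```

For effect A in a 2^3 design this reproduces the alternating sequence of Eq. (2.69).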

Other effects and interactions may be expressed in exactly the same way.
Table 2.98 shows the Yates notation and processing for $2^2$, $2^3$, $2^4$ and $2^5$
FUFE. The given procedure for calculating factor and interaction effects, together
with the specific features of two-level designs, gives a rather simple method of
processing results. As the number of degrees of freedom of each factor and interaction
effect is one, the sum of squares is identical to the variance estimate. Since factor
effects are obtained from the differences between responses of design points-trials
at the upper and lower levels, and the experiment is done at only two levels, it is
possible to relate the sum of squares, i.e. the variance estimate, to the calculated
effect:

$$ SS = 2^{k-2}\,(\text{effect})^{2} \qquad (2.70) $$
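
The origin of the factor $2^{k-2}$ can be traced in three lines. The derivation below is a sketch for an unreplicated $2^k$ design; the symbol L for the contrast (the signed sum of responses) is introduced here only for illustration and does not belong to the original notation.

```latex
\begin{aligned}
L &= \sum_{u=1}^{2^k} (\pm 1)\, y_u
  && \text{contrast: responses taken with the signs of the effect} \\
\text{effect} &= \frac{L}{2^{k-1}}
  && \text{difference between the averages at the upper and lower level} \\
SS &= \frac{L^{2}}{2^{k}}
    = \frac{\left(2^{k-1}\,\text{effect}\right)^{2}}{2^{k}}
    = 2^{k-2}\,(\text{effect})^{2}
\end{aligned}
```

For k = 3 the multiplier is 2, and for k = 4 it is 4.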

Hence, based on the calculated effects of individual factors and interactions,
relation (2.70) gives a variance estimate, which is further analyzed by the analysis
of variance method. Here, one should remember once again the difference between
two cases in estimating residual variance:

• full factorial designs with no trial replications,

• full factorial designs with trial replications.

Tabular presentations of FRFE are often found in Anglo-Saxon reference literature.
Let us keep here the Yates notation for seven factors A, B, C, D, E, F and G,
which are all varied on two levels. As a FUFE, such a design includes $2^7=128$
design points-trials, which are given geometrically in Table 2.99. Shaded cells in the
same table give a half-replica with $2^{7-1}=64$ design points-trials that belong to a
$2^{7-1}$ fractional factorial experiment. Associated geometric interpretations of a
1/4-replica and of a 1/8-replica are shown in Tables 2.100 and 2.101.

Table 2.102 offers a researcher several very useful two-level fractional factorial
designs with the effects that can be estimated (under the assumption that three-factor
and higher-order interactions are negligible). The design matrix consists of the
trials given for each FRFE, run in a completely random sequence.


Table 2.98 Main effects and interactions for $2^2$; $2^3$; $2^4$; $2^5$ FUFE, Yates notation

Columns, left to right: T, A, B, AB, C, AC, BC, ABC, D, AD, BD, ABD, CD, ACD, BCD,
ABCD, E, AE, BE, ABE, CE, ACE, BCE, ABCE, DE, ADE, BDE, ABDE, CDE, ACDE, BCDE, ABCDE

Trial  Signs
(1)    + - - + - + + -   - + + - + - - +   - + + - + - - +   + - - + - + + -
a      + + - - - - + +   - - + + + + - -   - - + + + + - -   + + - - - - + +
b      + - + - - + - +   - + - + + - + -   - + - + + - + -   + - + - - + - +
ab     + + + + - - - -   - - - - + + + +   - - - - + + + +   + + + + - - - -
c      + - - + + - - +   - + + - - + + -   ...


Table 2.99 Fractional factorial design $2^{7-1}$
[Grid layout: columns A(-)/A(+), each split into B(-)/B(+), C(-)/C(+) and D(-)/D(+);
rows E(-)/E(+), each split into F(-)/F(+) and G(-)/G(+). Cell shading not reproduced.]


Table 2.100 Fractional factorial design $2^{7-2}$
[Grid layout with columns headed A(-)/A(+) and rows E(-)/E(+), each split into
F(-)/F(+) and G(-)/G(+). Cell shading not reproduced.]


Table 2.101 Fractional factorial design $2^{7-3}$
[Grid layout: columns A(-)/A(+), each split into B(-)/B(+), C(-)/C(+) and D(-)/D(+);
rows E(-)/E(+), each split into F(-)/F(+) and G(-)/G(+). Cell shading not reproduced.]


Example  2.26  [29] 

In Example 2.12, by the method of random balance, factors were screened according
to the significance of their effects on the dynamic viscosity of uncured composite
rocket propellant. The screened-out factors are: X3 mixing speed; X5 time after
addition of AP; and X8 vacuum in the vertical planetary mixer. Since insufficient
vacuum in a mixer causes bubbles to appear in the cured propellant, the value of this
factor is fixed at the most convenient one. For the other two factors the basic
experiment has been designed according to a FUFE matrix, as shown in Table 2.103,
with the aim of obtaining the mathematical model of viscosity change.


Table 2.102 Fractional factorial designs
[Column headings: Design, Trials, Effects. Table body not reproduced.]

Table 2.103 Full factorial experiment $2^2$

Name                  Mixing speed X1, min^-1   Time X2, min
Basic level                    60                    95
Variation interval             20                    85
Upper level                    80                   180
Lower level                    40                    10

        Design matrix   Oper. matrix   Response: viscosity
Trial   X1    X2        x1    x2       yu1      yu2      yu3      yu4      yu5      yu6      mean yu
1       -     -         40    10       1182.4   1139.2   1136.5   1209.4   1134.6   1159.2   1160.2
2       +     -         80    10        622.4    660.8    631.8    602.1    668.6    645.6    638.5
3       -     +         40    180       683.2    624.0    682.6    699.5    565.4    611.2    644.3
4       +     +         80    180       496.0    486.2    495.6    513.6    450.9    467.2    484.9
Sum                                                                                          2927.54


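Regression coefficients are determined in accordance with relations (2.62)-(2.64):

$$ b_0 = \frac{\sum_{u=1}^{N} \bar{y}_u}{N} = \frac{1160.2+638.5+644.3+484.9}{4} = 731.98 $$

$$ b_1 = \frac{\sum_{u=1}^{N} X_{1u}\,\bar{y}_u}{N} = \frac{-1160.2+638.5-644.3+484.9}{4} = -170.28 $$

$$ b_2 = \frac{\sum_{u=1}^{N} X_{2u}\,\bar{y}_u}{N} = \frac{-1160.2-638.5+644.3+484.9}{4} = -167.38 $$

$$ b_{12} = \frac{\sum_{u=1}^{N} X_{1u} X_{2u}\,\bar{y}_u}{N} = \frac{1160.2-638.5-644.3+484.9}{4} = 90.58 $$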

The mathematical model, i.e. the linear regression equation, has the form:

$$ y = 731.98 - 170.28X_1 - 167.38X_2 + 90.58X_1X_2 $$

Since the relation between actual and coded factors is given by the expressions:

$$ X_1 = \frac{x_1-60}{20}; \qquad X_2 = \frac{x_2-95}{85}; $$

the regression equation with actual factors is:

$$ y = 731.98 - 170.28\,\frac{x_1-60}{20} - 167.38\,\frac{x_2-95}{85} + 90.58\,\frac{x_1-60}{20}\cdot\frac{x_2-95}{85} $$
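
As a quick plausibility check, the model with actual factors can be evaluated at the four design points, where it should return the tabulated response means of Table 2.103. The sketch below assumes only the coefficients, basic levels and variation intervals quoted in this example; the function names `code` and `viscosity` are illustrative.

```python
def code(x, basic_level, interval):
    """Convert an actual factor value to its coded value."""
    return (x - basic_level) / interval

def viscosity(x1, x2):
    """Predicted viscosity from the regression model of Example 2.26.

    x1 -- mixing speed, min^-1 (basic level 60, variation interval 20)
    x2 -- time, min            (basic level 95, variation interval 85)
    """
    X1 = code(x1, 60.0, 20.0)
    X2 = code(x2, 95.0, 85.0)
    return 731.98 - 170.28 * X1 - 167.38 * X2 + 90.58 * X1 * X2

# Design points of Table 2.103 with their tabulated response means
design_points = [(40, 10, 1160.2), (80, 10, 638.5), (40, 180, 644.3), (80, 180, 484.9)]
for x1, x2, mean in design_points:
    print(f"x1={x1:3d}  x2={x2:3d}  model={viscosity(x1, x2):8.2f}  table={mean:8.1f}")
```

The model has four coefficients and the design has four points, so the fit passes through the means up to rounding of the coefficients.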
Example 2.27 [30]

It is necessary to do mathematical modeling of the effects of X1 dyeing time, X2 dye
concentration and X3 temperature of dyeing on the efficiency of dyeing. For this, the
FUFE design from Table 2.93 has been used for the three mentioned factors. Formula
(2.59) has been used in coding the factors, with basic levels and variation intervals
being taken into account:

$$ X_1 = \frac{x_1-10}{5}; \qquad X_2 = \frac{x_2-1.0}{0.5}; \qquad X_3 = \frac{x_3-85}{10}; $$

The design matrix with the operational matrix and outcomes of design points-trials is
given in Table 2.104. Note that design points-trials have been replicated, so that the
table gives response means.


Table 2.104 Full factorial experiment $2^3$

No.      Design matrix                                   Operational matrix      Response
trials   X0  X1  X2  X3  X1X2  X1X3  X2X3  X1X2X3        x1, min  x2, %  x3, °C  mean yu
1        +   -   -   -   +     +     +     -             5        0.5    75      81.08
2        +   +   -   -   -     -     +     +             15       0.5    75      85.65
3        +   -   +   -   -     +     -     +             5        1.5    75      82.27
4        +   +   +   -   +     -     -     -             15       1.5    75      90.40
5        +   -   -   +   +     -     -     +             5        0.5    95      84.95
6        +   +   -   +   -     +     -     -             15       0.5    95      89.95
7        +   -   +   +   -     -     +     -             5        1.5    95      85.25
8        +   +   +   +   +     +     +     +             15       1.5    95      88.25

Regression coefficients of the linear regression have these values:

$b_0=85.98$; $b_1=2.60$; $b_2=0.54$; $b_3=1.13$;

$b_{12}=0.20$; $b_{13}=-0.59$; $b_{23}=-0.92$; $b_{123}=-0.70$

and the linear regression has the form:

$$ y = 85.98 + 2.60X_1 + 0.54X_2 + 1.13X_3 + 0.20X_1X_2 - 0.59X_1X_3 - 0.92X_2X_3 - 0.70X_1X_2X_3 $$


Example  2.28  [21] 

This example analyzes lab research of yields in a nitration process, which gives the
basic product for medicine and dye industries. Three factors assumed to have effects
on yield in the nitration process have been researched:

1. A nitric acid dosing time, h;

2. B mixing time, h;

3. C factor of mixing remnants from previous batch.

These factor levels have been used in the experiment:

Factors   Lower level                 Upper level
A         2                           7
B         0.5                         4
C         with no mixture remnants    with mixture remnants

Do the outcome analysis by classical and Yates processing. The measurement
results are in Table 2.105.


Table 2.105 Full factorial experiment $2^3$

No.  Yates   Design matrix                            Operational matrix     Response
             T   A   B   AB  C   AC  BC  ABC          A    B     C           yu
1    (1)     +   -   -   +   -   +   +   -            2    0.5   WITHOUT     87.2
2    a       +   +   -   -   -   -   +   +            7    0.5   WITHOUT     88.4
3    b       +   -   +   -   -   +   -   +            2    4     WITHOUT     82.0
4    ab      +   +   +   +   -   -   -   -            7    4     WITHOUT     83.0
5    c       +   -   -   +   +   -   -   +            2    0.5   WITH        86.7
6    ac      +   +   -   -   +   +   -   -            7    0.5   WITH        89.2
7    bc      +   -   +   -   +   -   +   -            2    4     WITH        83.4
8    abc     +   +   +   +   +   +   +   +            7    4     WITH        83.7

According to Eq. (2.67) we get:

A = 1/4(a+ab+ac+abc) - 1/4((1)+b+c+bc) = (88.4+83.0+89.2+83.7)/4 - (87.2+82.0+86.7+83.4)/4
A = 1.25

B = 1/4(b+ab+bc+abc) - 1/4((1)+a+c+ac) = (82.0+83.0+83.4+83.7)/4 - (87.2+88.4+86.7+89.2)/4
B = -4.85

C = 1/4(c+ac+bc+abc) - 1/4((1)+a+b+ab) = (86.7+89.2+83.4+83.7)/4 - (87.2+88.4+82.0+83.0)/4
C = 0.60

AB = 1/4((1)+ab+c+abc) - 1/4(a+b+ac+bc) = (87.2+83.0+86.7+83.7)/4 - (88.4+82.0+89.2+83.4)/4
AB = -0.60

AC = 1/4((1)+b+ac+abc) - 1/4(a+ab+c+bc) = (87.2+82.0+89.2+83.7)/4 - (88.4+83.0+86.7+83.4)/4
AC = 0.15

BC = 1/4((1)+a+bc+abc) - 1/4(b+ab+c+ac) = (87.2+88.4+83.4+83.7)/4 - (82.0+83.0+86.7+89.2)/4
BC = 0.45

ABC = 1/4(a+b+c+abc) - 1/4((1)+ab+ac+bc) = (88.4+82.0+86.7+83.7)/4 - (87.2+83.0+89.2+83.4)/4
ABC = -0.50;

T = (1)+a+b+ab+c+ac+bc+abc = 683.60.

Since the effects are twice the corresponding regression coefficients, it follows:

$b_0=T/8=85.45$; $b_A=A/2=0.63$; $b_B=B/2=-2.43$; $b_C=C/2=0.30$;

$b_{AB}=AB/2=-0.30$; $b_{AC}=AC/2=0.075$; $b_{BC}=BC/2=0.225$; $b_{ABC}=ABC/2=-0.25$.

To demonstrate the relation between effects and estimates of associated variances,
we will transform Table 2.105 into Table 2.106.


Table 2.106 Analysis of variance

Factors   C(-)                    C(+)
          A(-)        A(+)        A(-)        A(+)
B(-)      (1) 87.2    a  88.4     c  86.7     ac  89.2
B(+)      b   82.0    ab 83.0     bc 83.4     abc 83.7

Variance estimates based on f = 1 degree of freedom are calculated from the effects
by Eq. (2.70):

$MS_A = 2^{k-2}A^2 = 2^{3-2}\times 1.25^2 = 3.125$; $MS_B = 2(-4.85)^2 = 47.045$;

$MS_C = 2\times 0.6^2 = 0.720$; $MS_{AB} = 2(-0.6)^2 = 0.720$; $MS_{AC} = 2\times 0.15^2 = 0.045$;

$MS_{BC} = 2\times 0.45^2 = 0.405$; $MS_{ABC} = 2(-0.5)^2 = 0.500$.

Do not forget that the experiment has been done with no replication of trials, so
that the residual variance must be determined from the interaction variances. To
check whether all interaction variances may be pooled, or whether some of them should
not be included in the residual variance (because the effect is significant and
differs from the others), use the Bartlett criterion, as shown in Sect. 1.5. Comparing
the analysis of variance outcomes from Problem 1.34 with the values obtained here
shows that the outcomes are completely identical.
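
The arithmetic of this example can be reproduced with a short script. It is a sketch that uses only the responses of Table 2.105; the helper name `effect` and the dictionary names are illustrative choices.

```python
# Responses of Table 2.105 in Yates order: (1), a, b, ab, c, ac, bc, abc
trials = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
y = [87.2, 88.4, 82.0, 83.0, 86.7, 89.2, 83.4, 83.7]
k = 3

def effect(name):
    """Average effect: signed sum of responses divided by 2^(k-1)."""
    total = 0.0
    for trial, response in zip(trials, y):
        sign = 1
        for factor in name.lower():
            sign *= 1 if factor in trial else -1
        total += sign * response
    return total / 2 ** (k - 1)

names = ["A", "B", "AB", "C", "AC", "BC", "ABC"]
effects = {name: effect(name) for name in names}
coefficients = {name: e / 2 for name, e in effects.items()}              # b = effect / 2
mean_squares = {name: 2 ** (k - 2) * e ** 2 for name, e in effects.items()}  # Eq. (2.70)

for name in names:
    print(f"{name:>3}: effect={effects[name]:7.3f}  "
          f"b={coefficients[name]:7.3f}  MS={mean_squares[name]:7.3f}")
```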

Example  2.29  [21] 

In lab studies of isatin yield, the conditions of the technological procedure for
producing this product from isonitrozoacetylamine have been tested. The effects of
three process factors have been assessed.

Factor                                        Lower level   Upper level
A concentration of basic raw material, %      87            93
B duration of reaction, min                   15            30
C temperature of reaction, °C                 60            70

Based on previous testing of the research subject, the design of the full factorial
experiment $2^3$ with one replication to determine experimental error has been
chosen. To eliminate the influence of systematic error in doing the experiment, the
sequence of doing design points-trials has been completely random, in accord with the
theory of design of experiments. The outcomes are given in Table 2.107.


Table 2.107 Full factorial experiment $2^3$

No.  Yates   Design matrix                          Operational matrix   Response
             T   A   B   AB  C   AC  BC  ABC        A    B    C          yu1    yu2    sum yu
1    (1)     +   -   -   +   -   +   +   -          87   15   60         6.08   6.31   12.39
2    a       +   +   -   -   -   -   +   +          93   15   60         6.04   6.09   12.13
3    b       +   -   +   -   -   +   -   +          87   30   60         6.53   6.12   12.65
4    ab      +   +   +   +   -   -   -   -          93   30   60         6.43   6.36   12.79
5    c       +   -   -   +   +   -   -   +          87   15   70         6.79   6.77   13.56
6    ac      +   +   -   -   +   +   -   -          93   15   70         6.68   6.38   13.06
7    bc      +   -   +   -   +   -   +   -          87   30   70         6.73   6.49   13.22
8    abc     +   +   +   +   +   +   +   +          93   30   70         6.08   6.23   12.31

According to Eq. (2.67) we get:

A = -0.191; B = -0.021; C = 0.274; AB = -0.001;

AC = -0.161; BC = -0.251; ABC = -0.101; T = 51.055.

The sum of squares of effects and interactions, or mean square (f = 1), is obtained
from Eq. (2.70):

$$ MS_A = 2^{k-2}\,n\,(\text{effect})^2 = 2 \times 2\,(-0.191)^2 = 0.146 $$

$MS_B=0.002$; $MS_C=0.300$; $MS_{AB}=0.000$; $MS_{AC}=0.104$; $MS_{BC}=0.253$; $MS_{ABC}=0.041$;

$$ SS_E = \sum_{u=1}^{N}\sum_{i=1}^{n} (y_{ui}-\bar{y}_u)^2 = 0.200; \qquad MS_E = SS_E/N = 0.200/8 = 0.025. $$

Note that in preliminary calculations the sum of replicated design points-trials is
taken as the response, and thus the number of replications n is introduced into
Eq. (2.70). As trials are replicated, the error sum of squares is calculated in accord
with analysis of variance methodology. With n = 2 each design point-trial contributes
n - 1 = 1 degree of freedom, so the error has N = 8 degrees of freedom. To enable
comparison of this variance determination with classical analysis of variance, it is
necessary to transform Table 2.107 into Table 2.108.
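
The replicated case can be checked numerically as follows. This is a minimal sketch based only on the responses of Table 2.107; the helper name `contrast` and the variable names are illustrative.

```python
# Replicated responses of Table 2.107 in Yates order: (1), a, b, ab, c, ac, bc, abc
trials = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]
replicates = [
    (6.08, 6.31), (6.04, 6.09), (6.53, 6.12), (6.43, 6.36),
    (6.79, 6.77), (6.68, 6.38), (6.73, 6.49), (6.08, 6.23),
]
k, n = 3, 2   # three factors, two runs per design point-trial

def contrast(name):
    """Signed sum of the replicate totals for the named effect."""
    total = 0.0
    for trial, runs in zip(trials, replicates):
        sign = 1
        for factor in name.lower():
            sign *= 1 if factor in trial else -1
        total += sign * sum(runs)
    return total

names = ["A", "B", "AB", "C", "AC", "BC", "ABC"]
effects = {m: contrast(m) / (n * 2 ** (k - 1)) for m in names}
mean_squares = {m: n * 2 ** (k - 2) * effects[m] ** 2 for m in names}  # Eq. (2.70) with n

# Error sum of squares: scatter of the runs around their own cell mean
ss_error = sum((run - sum(runs) / n) ** 2 for runs in replicates for run in runs)
ms_error = ss_error / (len(replicates) * (n - 1))   # N(n - 1) = 8 degrees of freedom

print({m: round(e, 3) for m, e in effects.items()})
print({m: round(v, 3) for m, v in mean_squares.items()})
print(round(ss_error, 3), round(ms_error, 3))
```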


Table 2.108 Analysis of variance

Factors   C(-)              C(+)
          B(-)     B(+)     B(-)     B(+)
A(-)      (1)      b        c        bc
          6.08     6.53     6.79     6.73
          6.31     6.12     6.77     6.49
A(+)      a        ab       ac       abc
          6.04     6.43     6.68     6.08
          6.09     6.36     6.38     6.23

According to the analysis of variance calculation from Sect. 1.5 we get:

$$ \sum_i\sum_j\sum_k\sum_l y_{ijkl}^2 = 652.70; \qquad y_{....} = 102.11; \qquad y_{....}^2 = 10426.45; $$

$$ \sum_i y_{i...}^2 = 51.82^2 + 50.29^2 = 5214.40 $$

$$ \sum_j y_{.j..}^2 = (24.52+25.44)^2 + (26.62+25.53)^2 = 5215.62 $$

$$ \sum_k y_{..k.}^2 = (24.52+26.62)^2 + (25.44+25.53)^2 = 5213.24 $$

$$ \sum_i\sum_j y_{ij..}^2 = (12.39+12.65)^2 + (13.56+13.22)^2 + (12.13+12.79)^2 + (13.06+12.31)^2 = 2608.82 $$

$$ \sum_i\sum_k y_{i.k.}^2 = (12.39+13.56)^2 + (12.13+13.06)^2 + (12.65+13.22)^2 + (12.79+12.31)^2 = 2607.21 $$

$$ \sum_j\sum_k y_{.jk.}^2 = (12.39+12.13)^2 + (12.65+12.79)^2 + (13.56+13.06)^2 + (13.22+12.31)^2 = 2608.82 $$

$$ \sum_i\sum_j\sum_k y_{ijk.}^2 = 12.39^2+12.65^2+13.56^2+13.22^2+12.13^2+12.79^2+13.06^2+12.31^2 = 1305.00 $$

SST = 652.70 - 10426.45/16 = 1.045
SSC = 5215.62/8 - 651.65 = 0.300
SSR = 5214.40/8 - 651.65 = 0.146
SSL = 5213.24/8 - 651.65 = 0.002
SSCR = 2608.82/4 - 5214.40/8 - 5215.62/8 + 651.65 = 0.104
SSCL = 2608.82/4 - 5215.62/8 - 5213.24/8 + 651.65 = 0.248
SSRL = 2607.21/4 - 5214.40/8 - 5213.24/8 + 651.65 = 0.000
SSE = 652.70 - 1305.00/2 = 0.200

SSCRL = 1.045 - 0.300 - 0.146 - 0.002 - 0.104 - 0.248 - 0.200 = 0.04

The completed table of analysis of variance is given in the solutions of Problem 1.26.

Example  2.30  [21] 

The research of the previous example has been expanded with another factor.
Hence, the effects of these factors have been analyzed in this research:

Factor                                        Lower level   Upper level
A concentration of basic raw material, %      87            93
B duration of reaction, min                   15            30
C quantity of basic raw material, ml          35            45
D temperature of reaction, °C                 60            70

The experiment has been done through the matrix of the full factorial experiment
$2^4$, as shown in Table 2.109. Each trial has been done only once, with no
replications. The sequence of doing trials has been completely random.


Table 2.109

No.     Yates  Design matrix                                                      Operational matrix  Response
trials         T  A  B  AB C  AC BC ABC D  AD BD ABD CD ACD BCD ABCD              A   B   C   D       yu
1       (1)    +  -  -  +  -  +  +  -   -  +  +  -   +  -   -   +                 87  15  35  60      6.08
2       a      +  +  -  -  -  -  +  +   -  -  +  +   +  +   -   -                 93  15  35  60      6.04
3       b      +  -  +  -  -  +  -  +   -  +  -  +   +  -   +   -                 87  30  35  60      6.53
4       ab     +  +  +  +  -  -  -  -   -  -  -  -   +  +   +   +                 93  30  35  60      6.43
5       c      +  -  -  +  +  -  -  +   -  +  +  -   -  +   +   -                 87  15  45  60      6.31
6       ac     +  +  -  -  +  +  -  -   -  -  +  +   -  -   +   +                 93  15  45  60      6.09
7       bc     +  -  +  -  +  -  +  -   -  +  -  +   -  +   -   +                 87  30  45  60      6.12
8       abc    +  +  +  +  +  +  +  +   -  -  -  -   -  -   -   -                 93  30  45  60      6.36
9       d      +  -  -  +  -  +  +  -   +  -  -  +   -  +   +   -                 87  15  35  70      6.79
10      ad     +  +  -  -  -  -  +  +   +  +  -  -   -  -   +   +                 93  15  35  70      6.68
11      bd     +  -  +  -  -  +  -  +   +  -  +  -   -  +   -   +                 87  30  35  70      6.73
12      abd    +  +  +  +  -  -  -  -   +  +  +  +   -  -   -   -                 93  30  35  70      6.08
13      cd     +  -  -  +  +  -  -  +   +  -  -  +   +  -   -   +                 87  15  45  70      6.77
14      acd    +  +  -  -  +  +  -  -   +  +  -  -   +  +   -   -                 93  15  45  70      6.38
15      bcd    +  -  +  -  +  -  +  -   +  -  +  -   +  -   +   -                 87  30  45  70      6.49
16      abcd   +  +  +  +  +  +  +  +   +  +  +  +   +  +   +   +                 93  30  45  70      6.23

According to Eq. (2.67) we get:

A = -0.191; B = -0.021; AB = -0.001; C = -0.076; AC = 0.034;

BC = -0.066; ABC = 0.149; D = 0.274; AD = -0.161; BD = -0.251;

ABD = -0.101; CD = -0.026; ACD = -0.006; BCD = 0.124; ABCD = 0.019

Associated sums of squares are:

$SS_A = MS_A = 2^{k-2}A^2 = 2^2(-0.191)^2 = 0.1463$;

$MS_B=0.0018$; $MS_C=0.0233$; $MS_D=0.2998$; $MS_{AB}=0.0000$;

$MS_{AC}=0.0046$; $MS_{AD}=0.1040$; $MS_{BC}=0.0176$; $MS_{BD}=0.2525$;

$MS_{CD}=0.0028$; $MS_{ABC}=0.0885$; $MS_{ABD}=0.0410$; $MS_{ACD}=0.0002$;

$MS_{BCD}=0.0613$; $MS_{ABCD}=0.0014$.


By transforming Table 2.109 into Table 2.110 we get the analysis of variance:

Table 2.110 Analysis of variance

                 D(-)                     D(+)
                 C(-)        C(+)         C(-)        C(+)
A(-)   B(-)      (1) 6.08    c   6.31     d   6.79    cd   6.77
       B(+)      b   6.53    bc  6.12     bd  6.73    bcd  6.49
A(+)   B(-)      a   6.04    ac  6.09     ad  6.68    acd  6.38
       B(+)      ab  6.43    abc 6.36     abd 6.08    abcd 6.23

The outcomes of analysis of variance are shown in Table 2.111.

Table 2.111 Results of analysis of variance

Sources of variation          f    MS       F      F(1;5;0.95)
Concentration A               1    0.1463   0.76   6.61
Reaction time B               1    0.0018   0.01   6.61
Quantity of raw material C    1    0.0233   0.12   6.61
Temperature of reaction D     1    0.2998   1.56   6.61
AB                            1    0.0000   0.00   6.61
AC                            1    0.0046   0.02   6.61
AD                            1    0.1040   0.54   6.61
BC                            1    0.0176   0.09   6.61
BD                            1    0.2525   1.31   6.61
CD                            1    0.0028   0.01   6.61
ABC+ABD+ACD+BCD+ABCD          5    0.1924   -      -
Total                         15   -        -      -

Example  2.31  [25] 

A full factorial experiment has been done in a pilot plant. The research included
refinement of a product by steam distillation. Five factors have been analyzed, each
one at two levels: A concentration, B flow, C volume of solution, D mixing speed and
E solvent-to-water ratio. Acidity of the product in each of 32 trials has been
analyzed as the response. Outcomes in coded form are shown in Table 2.112.

Table 2.112 Full factorial experiment $2^5$

               A(-)                          A(+)
               D(-)          D(+)            D(-)          D(+)
               E(-)   E(+)   E(-)   E(+)     E(-)   E(+)   E(-)   E(+)
B(-)   C(-)    9      3      11     8        10     9      13     7
       C(+)    3      5      7      7        5      6      10     7
B(+)   C(-)    8      4      9      8        6      6      16     6
       C(+)    6      4      7      5        10     10     13     6

Data from Table 2.112 have been analyzed by the Yates technique and the outcomes
are given in Table 2.113. What is of interest in relation to the former example is
that the mechanical method, which does not require knowledge of Eq. (2.67), is
demonstrated. The upper half of column (1) is obtained by adding up successive pairs
of response data, and the lower half by subtracting within the same pairs. For
example, 19=9+10, 14=8+6, ..., 11=5+6, 1=10-9, -2=6-8, ..., 1=6-5. As shown, the
differences are taken from the same data pairs, always in this way: the second value
minus the first, the fourth minus the third, and so on to the column end. Column (2)
is obtained from column (1) in the same way, column (3) from (2), (4) from (3) and
(5) from (4). This calculation is thus repeated k times for a $2^k$ full factorial
experiment. Column (5) gives the total effects of the factors and interactions.
Average effects are obtained by dividing the totals by N/2.

Finally, the last column gives the sums of squares of factors and interactions.
These are obtained by dividing the squares of the values of column (5) (total
effects) by the total number of trials $N=2^k$.
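
The mechanical procedure described above translates directly into code. The following sketch assumes responses listed in standard (Yates) order; the function name `yates` and the variable names are illustrative. It is applied here to the 32 responses of this example.

```python
def yates(responses):
    """Yates algorithm for a 2^k design with responses in standard order.

    Returns the final column: the grand total followed by the total effects
    (contrasts) in the order A, B, AB, C, AC, BC, ABC, D, ...
    """
    n = len(responses)
    if n < 2 or n & (n - 1):
        raise ValueError("number of responses must be a power of two")
    k = n.bit_length() - 1
    column = list(responses)
    for _ in range(k):
        pairs = [(column[i], column[i + 1]) for i in range(0, n, 2)]
        sums = [first + second for first, second in pairs]    # upper half of new column
        diffs = [second - first for first, second in pairs]   # lower half: second minus first
        column = sums + diffs
    return column

# Responses in standard order: (1), a, b, ab, c, ac, bc, abc, d, ..., abcde
y = [9, 10, 8, 6, 3, 5, 6, 10, 11, 13, 9, 16, 7, 10, 7, 13,
     3, 9, 4, 6, 5, 6, 4, 10, 8, 7, 8, 6, 7, 7, 5, 6]
totals = yates(y)
N = len(y)
average_effects = [t / (N / 2) for t in totals[1:]]   # total effect divided by N/2
sums_of_squares = [t ** 2 / N for t in totals[1:]]    # (total effect)^2 divided by N

print("grand total:", totals[0])
print("A:", totals[1], average_effects[0], sums_of_squares[0])
print("E:", totals[16], average_effects[15], sums_of_squares[15])
```

Each pass needs only additions and subtractions, so a $2^k$ design is processed in k passes over N values.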

Table 2.113 Yates method $2^5$

Trials  Response  (1)   (2)   (3)   (4)   (5)    Total     Average        SS=MS
                                                 effects   effects (5)/16 (5)^2/32
(1)     9         19    33    57    143   244    T         -              -
a       10        14    24    86    101   36     16A       2.250          40.500
b       8         8     49    47    23    4      16B       0.250          0.500
ab      6         16    37    54    13    8      16AB      0.500          2.000
c       3         24    22    5     7     -22    16C       -1.375         15.125
ac      5         25    25    18    -3    10     16AC      0.625          3.125
bc      6         17    29    15    7     18     16BC      1.125          10.125
abc     10        20    25    -2    1     14     16ABC     0.875          6.125
d       11        12    -1    3     -21   36     16D       2.250          40.500
ad      13        10    6     4     -1    -4     16AD      -0.250         0.500
bd      9         11    9     1     7     -4     16BD      -0.250         0.500
abd     16        14    9     -4    3     8      16ABD     0.500          2.000
cd      7         15    8     -1    15    -10    16CD      -0.625         3.125
acd     10        14    7     8     3     -2     16ACD     -0.125         0.125
bcd     7         14    -3    1     3     -18    16BCD     -1.125         10.125
abcd    13        11    1     0     11    -14    16ABCD    -0.875         *6.125
e       3         1     -5    -9    29    -42    16E       -2.625         55.125
ae      9         -2    8     -12   7     -10    16AE      -0.625         3.125
be      4         2     1     3     13    -10    16BE      -0.625         3.125
abe     6         4     3     -4    -17   -6     16ABE     -0.375         1.125
ce      5         2     -2    7     1     20     16CE      1.250          12.500
ace     6         7     3     0     -5    -4     16ACE     -0.250         0.500
bce     4         3     -1    -1    9     -12    16BCE     -0.750         4.500
abce    10        6     -3    4     -1    8      16ABCE    0.500          *2.000
de      8         6     -3    13    -3    -22    16DE      -1.375         15.125
ade     7         2     2     2     -7    -30    16ADE     -1.875         28.125
bde     8         1     5     5     -7    -6     16BDE     -0.375         1.125
abde    6         6     3     -2    5     -10    16ABDE    -0.625         *3.125
cde     7         -1    -4    5     -11   -4     16CDE     -0.250         0.500
acde    7         -2    5     -2    -7    12     16ACDE    0.750          *4.500
bcde    5         0     -1    9     -7    4      16BCDE    0.250          *0.500
abcde   6         1     1     2     -7    0      16ABCDE   0.000          *0.000
Total   244

If we look at the sums of squares in Table 2.113, we may assume, without testing
them by Bartlett's criterion, that the four- and five-factor interactions are
insignificant. That is, those sums of squares, marked by asterisks, may be used as
residual variance in the analysis of variance in Table 2.114.

Table 2.114 Analysis of variance

Sources of variations   SS        f    MS       F       F(1;6;0.95)
A                       40.500    1    40.500   14.96   5.99
B                       0.500     1    0.500    -       5.99
C                       15.125    1    15.125   5.58    5.99
D                       40.500    1    40.500   14.96   5.99
E                       55.125    1    55.125   20.36   5.99
AB                      2.000     1    -        -       -
AC                      3.125     1    -        -       -
AD                      0.500     1    -        -       -
AE                      3.125     1    -        -       -
BC                      10.125    1    -        -       -
BD                      0.500     1    -        -       -
BE                      3.125     1    -        -       -
CD                      3.125     1    -        -       -
CE                      12.500    1    12.500   4.62    5.99
DE                      15.125    1    15.125   5.58    5.99
ABC                     6.125     1    -        -       -
ABD                     2.000     1    -        -       -
ABE                     1.125     1    -        -       -
ACD                     0.125     1    -        -       -
ACE                     0.500     1    -        -       -
ADE                     28.125    1    28.125   10.4    5.99
BCD                     10.125    1    -        -       -
BCE                     4.500     1    -        -       -
BDE                     1.125     1    -        -       -
CDE                     0.500     1    -        -       -
Residual                16.250    6    2.708    -       -
Total                   275.500   31   -        -       -
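
The residual line and the F ratios of Table 2.114 can be retraced with a few lines of code. The sketch below pools the asterisked sums of squares from Table 2.113 and tests the main effects only; the variable names are illustrative, and the critical value 5.99 is simply the one tabulated above.

```python
# Sums of squares marked by asterisks in Table 2.113
pooled = {"ABCD": 6.125, "ABCE": 2.000, "ABDE": 3.125,
          "ACDE": 4.500, "BCDE": 0.500, "ABCDE": 0.000}
ss_residual = sum(pooled.values())
f_residual = len(pooled)                 # one degree of freedom per pooled interaction
ms_residual = ss_residual / f_residual

# Mean squares of the main effects (f = 1, so MS = SS)
main_effects = {"A": 40.500, "B": 0.500, "C": 15.125, "D": 40.500, "E": 55.125}
F_CRITICAL = 5.99                        # F(1; 6; 0.95) as given in Table 2.114

f_ratios = {name: ms / ms_residual for name, ms in main_effects.items()}
for name, f_ratio in f_ratios.items():
    verdict = "significant" if f_ratio > F_CRITICAL else "not significant"
    print(f"{name}: F = {f_ratio:5.2f} ({verdict})")
```

Small differences in the second decimal place against the table come from rounding the residual mean square before dividing.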

Example  2.32  [25] 

In developing a practical industrial fermentation process, we usually start with lab
studies of the physiological requirements of the micro-organism. Micro-organisms are
cultivated in a liquid medium and their growth depends on the substance being formed
in that medium. Formation of the substance, according to previous research, depends
on two components of the medium and two environmental factors: temperature
and aeration. The first two factors are X1 and X2 and the other two X3 and X4. The
experiment has been done as a full factorial experiment $2^4$ with a single
replication. Outcomes of the experiment are shown in Table 2.115.

Table 2.115 Full factorial experiment $2^4$

                   X1(-)               X1(+)
                   X2(-)     X2(+)     X2(-)     X2(+)
X4(-)   X3(-)      32.7      90.4      70.6      115.0
                   19.3      89.8      84.5      108.6
        X3(+)      20.2      94.1      76.1      133.6
                   29.9      96.5      73.3      131.6
X4(+)   X3(-)      50.0      72.6      104.2     81.3
                   52.1      76.9      103.4     88.2
        X3(+)      50.5      91.8      78.6      108.3
                   49.1      86.9      74.1      108.3

The results of data analysis by the Yates methodology are shown in Table 2.116.
Note that, here, the sums of replicate trials have been taken as response values.
Therefore, average effects are obtained by dividing the total effects by
$n\,2^k/2 = 16$. The situation is similar when calculating the sums of squares of
factors and interactions: they are calculated by dividing the squares of total effects
by the total number of data $N = n\,2^k$ (n is the number of replicate trials). The
residual sum of squares, as said before, has been determined according to the
equation:

$$ SS_E = \frac{\sum_{u=1}^{2^k}\sum_{i=1}^{n} (y_{ui}-\bar{y}_u)^2}{n-1} = 321.6; \qquad MS_E = \frac{SS_E}{2^k} = \frac{321.6}{16} = 20.1 $$


Table 2.116 Yates method

Trials     Response  (1)     (2)     (3)      (4)      Average effects  MS        F        F(1;16;0.95)
                                                       (4)/16           (4)²/32
(1)        52.0      207.1   610.9   1239.6   2542.5   -                -         -        -
X1         155.1     403.8   628.7   1302.9   536.9    33.56            9008.2    448.17   4.49
X2         180.2     309.7   655.3   272.0    605.3    37.83            11449.6   569.63   4.49
X1X2       223.6     319.0   647.6   264.9    -185.1   -11.57           1070.7    53.27    4.49
X3         102.1     199.5   146.5   206.0    10.1     0.63             3.2       0.16     4.49
X1X3       207.6     455.8   125.5   399.3    -103.9   -6.49            337.4     16.79    4.49
X2X3       149.5     252.3   173.9   -145.2   -300.7   -18.79           2825.6    140.58   4.49
X1X2X3     169.5     395.3   91.0    -39.9    -16.3    -1.02            8.3       0.41     4.49
X4         50.1      103.1   196.7   17.8     63.3     3.96             125.2     6.23     4.49
X1X4       149.4     43.4    9.3     -7.7     -7.1     -0.44            1.6       0.08     4.49
X2X4       190.6     105.5   256.3   -21.0    193.3    12.08            1167.7    58.09    4.49
X1X2X4     265.2     20.0    143.0   -82.9    105.3    6.58             346.5     17.24    4.49
X3X4       99.6      99.3    -59.7   -187.4   -25.5    -1.59            20.3      1.01     4.49
X1X3X4     152.7     74.6    -85.5   -113.3   -61.9    -3.87            119.7     5.96     4.49
X2X3X4     178.7     53.1    -24.7   -25.8    74.1     4.63             171.6     8.54     4.49
X1X2X3X4   216.6     37.9    -15.2   9.5      35.3     2.21             38.9      1.94     4.49
Residual   -         -       -       -        -        -                20.1      -        -
Total      -         -       -       -        -        -                -         -        -
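
The column-by-column arithmetic of the Yates table can be sketched in a few lines. This is an illustrative sketch rather than the author's code: the function name `yates` and the variable names are assumptions, the responses are the replicate sums of Table 2.116 in standard order, and the residual mean square 20.1 is taken from the text.

```python
def yates(responses):
    """Return the total effects (contrasts) for responses given in standard order.

    Each pass writes the sums of adjacent pairs followed by their differences
    (second minus first); k passes are needed for 2**k responses.
    """
    n = len(responses)
    k = n.bit_length() - 1
    if n != 2 ** k:
        raise ValueError("number of responses must be a power of two")
    column = list(responses)
    for _ in range(k):
        sums = [column[i] + column[i + 1] for i in range(0, n, 2)]
        diffs = [column[i + 1] - column[i] for i in range(0, n, 2)]
        column = sums + diffs
    return column


labels = ["(1)", "X1", "X2", "X1X2", "X3", "X1X3", "X2X3", "X1X2X3",
          "X4", "X1X4", "X2X4", "X1X2X4", "X3X4", "X1X3X4", "X2X3X4",
          "X1X2X3X4"]
# Sums of the two replicate observations, in standard (Yates) order.
responses = [52.0, 155.1, 180.2, 223.6, 102.1, 207.6, 149.5, 169.5,
             50.1, 149.4, 190.6, 265.2, 99.6, 152.7, 178.7, 216.6]
n_rep = 2      # replicates per design point
ms_e = 20.1    # residual mean square from the text

totals = yates(responses)
for label, total in zip(labels[1:], totals[1:]):
    effect = total / (n_rep * len(responses) / 2)   # column (4) divided by 16
    ms = total ** 2 / (n_rep * len(responses))      # column (4) squared, divided by 32
    print(f"{label:9s} {total:8.1f} {effect:7.2f} {ms:9.1f} {ms / ms_e:8.2f}")
```

The first element of `totals` is the grand total; the remaining elements are the total effects that appear in column (4) of the table.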

Example  2.33  [31] 

This example refers to the use of a fractional replica of FUFE in testing the adhesion of a
thermoplastic polymer and fibre, with k=7 factors included. Application of a
fractional replica has proved to be especially useful, since FUFE requires 2⁷=128 trials
and enormous time consumption. FRFE of type 2⁷⁻³ with only 16 trials has been used.
FUFE 2⁴ has been used to construct the fractional replica. Here, the following
generating ratios have been introduced: X5=X1X2X4; X6=X2X3X4; X7=X1X2X3X4. Excluding
all effects of triple and higher interactions, these aliased/confounded estimates of
regression coefficients have been obtained:

b1 = β1 + β67;  b2 = β2;  b3 = β3 + β57;  b4 = β4;  b5 = β5 + β37;
b6 = β6 + β17;  b7 = β7 + β35 + β16;
b12 = β12 + β45;  b13 = β13 + β56;  b14 = β14 + β25;  b23 = β23 + β46;
b24 = β24 + β36 + β15;  b34 = β34 + β26;  b27 = β27;  b47 = β47.
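
The alias structure listed above follows mechanically from the three generating ratios. The sketch below is only an illustration under stated assumptions: effects are represented as sets of factor indices, and the helper names `defining_words` and `aliases` are hypothetical. Multiplying two effects corresponds to the symmetric difference of their index sets, because the square of any coded factor equals 1.

```python
from itertools import combinations


def defining_words(generator_words):
    """All products of non-empty subsets of the generator words."""
    words = set()
    for r in range(1, len(generator_words) + 1):
        for subset in combinations(generator_words, r):
            product = frozenset()
            for word in subset:
                product = product ^ word   # repeated factors cancel
            words.add(product)
    return words


def aliases(effect, words, max_order=2):
    """Effects of order 1..max_order that are confounded with `effect`."""
    found = {frozenset(effect) ^ word for word in words}
    kept = [sorted(a) for a in found if len(a) in range(1, max_order + 1)]
    return sorted(kept, key=lambda a: (len(a), a))


# X5 = X1X2X4, X6 = X2X3X4, X7 = X1X2X3X4 written as words of the defining relation.
generators = [frozenset({1, 2, 4, 5}),
              frozenset({2, 3, 4, 6}),
              frozenset({1, 2, 3, 4, 7})]
words = defining_words(generators)

for effect in [(1,), (2,), (7,), (2, 4)]:
    print(effect, "is aliased with", aliases(effect, words))
```

For example, factor 7 comes out aliased with the interactions 16 and 35, and the interaction 24 with 15 and 36, in agreement with the estimates b7 and b24 given in the text.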

Factor symbols with variation intervals are shown in Table 2.117.

Table 2.117 Factors with variation intervals

Factors                             -         0       +        Δx
x1 pressing pressure, kp/cm²        5.0       12.5    20.0     7.5
x2 thermal processing time, min     0.5       1.5     2.5      1.0
x3 temperature, °C                  140       155     170      15
x4 pressing time, min               1         2       3        1
x5 dibutylphthalate, %              0         0.5     1.0      0.5
x6 tearing speed, m/min             80        140     200      60
x7 fiber type                       VISCOSE   -       KAPRON   -

The design matrix with the outcomes of the experiment is shown in Table 2.118.
Trials have been replicated 20 times due to the unreliable method of measuring
adhesion.

Table 2.118 Fractional factorial experiment 2⁷⁻³

No.     Design matrix                     Operational matrix                      Response
trials  X0  X1  X2  X3  X4  X5  X6  X7    x1   x2    x3    x4   x5    x6    x7    yu
1       +   +   +   +   +   +   +   +     20   2.5   170   3    1.0   200   K     17.99
2       +   -   +   +   +   -   +   -     5    2.5   170   3    0     200   V     15.31
3       +   +   -   +   +   -   -   -     20   0.5   170   3    0     80    V     17.16
4       +   -   -   +   +   +   -   +     5    0.5   170   3    1.0   80    K     14.75
5       +   +   +   -   +   +   -   -     20   2.5   140   3    1.0   80    V     35.49
6       +   -   +   -   +   -   -   +     5    2.5   140   3    0     80    K     33.17
7       +   +   -   -   +   -   +   +     20   0.5   140   3    0     200   K     38.30
8       +   -   -   -   +   +   +   -     5    0.5   140   3    1.0   200   V     24.39
9       +   +   +   +   -   -   -   -     20   2.5   170   1    0     80    V     32.23
10      +   -   +   +   -   +   -   +     5    2.5   170   1    1.0   80    K     45.64
11      +   +   -   +   -   +   +   +     20   0.5   170   1    1.0   200   K     41.17
12      +   -   -   +   -   -   +   -     5    0.5   170   1    0     200   V     19.17
13      +   +   +   -   -   -   +   +     20   2.5   140   1    0     200   K     17.55
14      +   -   +   -   -   +   +   -     5    2.5   140   1    1.0   200   V     14.49
15      +   +   -   -   -   +   -   -     20   0.5   140   1    1.0   80    V     18.52
16      +   -   -   -   -   -   -   +     5    0.5   140   1    0     80    K     12.50

Based on the obtained experimental results, regression coefficients of linear
regression have been determined by using Eqs. (2.62)-(2.64).

Y = 24.86 + 2.44X1 + 1.62X2 + 0.56X3 - 0.30X4 + 3.14X5 - 1.33X6 + 2.77X7


Example  2.34 

To produce a textile material resistant to fire, these four qualitative factors have been
tested:

Factors                      Lower level       Upper level
1. A textile material        satin             monks
2. B textile treatment       treatment X       treatment Y
3. C washing conditions      before washing    after washing
4. D direction of testing    alongside         crosswise

The experiment was done by an FRFE matrix of type 2⁴⁻¹. The response was the
burnt-out part of the textile 25.4 mm long. Determine the factor and interaction
effects, since there is no point in determining regression coefficients as the factors
are qualitative. To construct the fractional replica 2⁴⁻¹, design 2 from Table 2.102 was used.
The sequence of trials in the design matrix has been completely random. Outcomes
are shown in Table 2.119.

Table 2.119 Fractional replica 2⁴⁻¹

                     A(-)                  A(+)
                 B(-)       B(+)       B(-)       B(+)
C(-)   D(-)      (1) 4.2    -          -          ab 2.9
       D(+)      -          bd 5.0     ad 3.0     -
C(+)   D(-)      -          bc 4.6     ac 2.8     -
       D(+)      cd 4.0     -          -          abcd 2.3

Table 2.120 Yates analysis 2⁴⁻¹

Trials   Response   (1)    (2)    (3)    Average effects (3)/4   Estimated effects
(1)      4.2        7.2    15.1   28.8   -                       T
ad       3.0        7.9    13.7   -6.8   -1.70                   A
bd       5.0        6.8    -3.3   0.8    0.20                    B
ab       2.9        6.9    -3.5   -2.0   -0.50                   AB+CD
cd       4.0        -1.2   0.7    -1.4   -0.35                   C
ac       2.8        -2.1   0.1    -0.2   -0.05                   AC+BD
bc       4.6        -1.2   -0.9   -0.6   -0.15                   BC+AD
abcd     2.3        -2.3   -1.1   -0.2   -0.05                   D
Total    28.8       -      -      -      -                       -

The Yates procedure as demonstrated in Example 2.31 was used to estimate the
effects. Here, k has to be replaced by k', where k'=k-p. Total effects
divided by 2^(k'-1) = 4 give the average effects. Outcomes of the Yates analysis are
shown in Table 2.120.

Example  2.35  [32] 

The following chemical reaction has been studied in a chemical laboratory:

A + B + C → D + other products

The reaction takes place with solvent E being present. Five factors, as shown in
Table 2.121, were varied in lab conditions. In designing the experiment it has been
assumed that interaction effects are not significant. The experiment was therefore
done as a 2⁵⁻² design, i.e. a 1/4-replica. The design matrix with experimental outcomes is
shown in Table 2.122.

Table 2.121 Factors and intervals

Factors                          Levels
                                 -       +
X1 quantity of solvent E, cm³    200     250
X2 quantity C, mol/mol A         4.0     4.5
X3 concentration C, %            90      93
X4 reaction time, h              1       2
X5 quantity B, mol/mol A         3.0     3.5

These generating ratios have been used to construct the matrix: X4 = X1X2X3; X5 = -X2X3


Table 2.122 Fractional factorial experiment 2⁵⁻²

No.     Design matrix         Operational matrix          Response
trials  X1  X2  X3  X4  X5    x1    x2    x3   x4   x5    yu, %
1       -   -   -   -   -     200   4.0   90   1    3     34.4
2       -   -   +   +   +     200   4.0   93   2    3.5   51.6
3       -   +   -   +   +     200   4.5   90   2    3.5   31.2
4       -   +   +   -   -     200   4.5   93   1    3     45.1
5       +   -   -   +   -     250   4.0   90   2    3     54.1
6       +   -   +   -   +     250   4.0   93   1    3.5   62.4
7       +   +   -   -   +     250   4.5   90   1    3.5   50.2
8       +   +   +   +   -     250   4.5   93   2    3     58.6

The following regression coefficients have been obtained by processing the
results:

b0 = 48.5;  b1 = β1 - β45 = 7.9;  b2 = β2 - β35 = -2.2;  b3 = β3 - β25 = 6.0;
b123 = β4 - β15 = 0.4;  b23 = β23 + β14 - β5 = -0.4;  b13 = β13 + β24 = -1.8;  b12 = β12 + β34 = 0.2.
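
For a two-level orthogonal design each coefficient is the sum of the responses weighted by the signs of the corresponding column, divided by the number of trials. The block below is a minimal sketch of that calculation on the data of Table 2.122; the dictionary keys and the helper name `coefficient` are assumptions chosen to mirror the labels used in the text.

```python
# Coded levels of X1, X2, X3 for the eight trials of Table 2.122.
base = [(-1, -1, -1), (-1, -1, 1), (-1, 1, -1), (-1, 1, 1),
        (1, -1, -1), (1, -1, 1), (1, 1, -1), (1, 1, 1)]
y = [34.4, 51.6, 31.2, 45.1, 54.1, 62.4, 50.2, 58.6]

# Sign columns; X4 was set equal to X1X2X3 and X5 to -X2X3 when the design was built.
columns = {
    "b0": [1 for _ in base],
    "b1": [x1 for x1, x2, x3 in base],
    "b2": [x2 for x1, x2, x3 in base],
    "b3": [x3 for x1, x2, x3 in base],
    "b123": [x1 * x2 * x3 for x1, x2, x3 in base],
    "b23": [x2 * x3 for x1, x2, x3 in base],
    "b13": [x1 * x3 for x1, x2, x3 in base],
    "b12": [x1 * x2 for x1, x2, x3 in base],
}


def coefficient(signs, response):
    """Signed sum of the responses divided by the number of trials."""
    return sum(s * v for s, v in zip(signs, response)) / len(response)


b = {name: coefficient(signs, y) for name, signs in columns.items()}
for name, value in b.items():
    print(f"{name:5s} {value:7.3f}")
```

Rounded to one decimal, the printed values agree with the coefficients listed in the text; because of the aliasing, each value estimates the combined effect shown on the right-hand side there.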


Example  2.36  [12] 

All the steps formalized so far in defining and performing an experiment are
demonstrated in this example. The process of separating mercury from caustic,
as part of the process of extraction in a batch reactor with a mixer, is being
tested.

Selection  of  system  response 

Mercury content at the process outlet has been accepted as the system response.
The response is quantitative and it can be easily measured on an atomic absorption
spectrometer with an accuracy of up to 1×10⁻⁵ %.

Selection  of  factors 

These factors affect the mercury content at the process outlet:

1) RPM of the mixer, min⁻¹;

2) dwell time of caustic solution, min;

3) temperature of solution, °C;

4) initial mercury content in caustic, %;

5) quantity of extraction material, g;

6) solubility of mercury in the solvent, mg/g.

Previous experiment

A one-factor experiment was done, which showed that, in the assumed experimental
region, the quantity of extraction material and its saturation with mercury are of no
importance. Since a caustic with a constant mercury content of 12×10⁻⁴ % was used
in this experiment, this factor is also excluded from consideration. After this
selection the following factors have remained:

x1 number of mixer rotations, min⁻¹; x2 temperature of solution, °C;
x3 dwell time of solution, min.

Description of experimental equipment

Mercury extraction is done in a lab glass reactor consisting of a thick glass vessel
69 mm in diameter and 200 mm high. The vessel is electrically heated, with
regulation possible to within ±1 °C. Mixing is done by a glass mixer with two blades,
with rotation adjustable from 250 to 4000 min⁻¹. The reactor construction
makes both batch and continuous operation possible. Industrial nonfiltered
caustic solution of 43 to 46% is used in the experiment. Mercury content in such a
solution has been from 8×10⁻⁴ to 18×10⁻⁴ %.

Definition  of  the  problem 

Find the conditions for extracting mercury from caustic solution such that the mercury
content at the process outlet is minimal.

Factor space

These limits were selected for all factors based on previous experiments
and theoretical knowledge:

1000 < x1 < 4000 min⁻¹;  70 < x2 < 130 °C;  0 < x3 < 100 min

Selection of null level

This experimental center was suggested based on previous information:

x10 = 2500 min⁻¹;  x20 = 100 °C;  x30 = 45 min

Selection of variation interval

These variation intervals were chosen to realize the basic experiment:

Δx1 = 500 min⁻¹;  Δx2 = 10 °C;  Δx3 = 15 min
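
The null level and the variation interval together fix the relation between coded and natural factor values, X = (x - x0)/Δx. The sketch below illustrates this standard coding for the three factors of this example; the function names and the dictionary layout are assumptions made for illustration only.

```python
# Null levels and variation intervals chosen in the text.
centre = {"x1": 2500.0, "x2": 100.0, "x3": 45.0}   # min^-1, degrees C, min
step = {"x1": 500.0, "x2": 10.0, "x3": 15.0}


def to_natural(coded):
    """Convert coded levels (-1, 0, +1) into natural units."""
    return {name: centre[name] + coded[name] * step[name] for name in centre}


def to_coded(natural):
    """Convert natural values into coded units."""
    return {name: (natural[name] - centre[name]) / step[name] for name in centre}


upper = to_natural({"x1": 1, "x2": 1, "x3": 1})
lower = to_natural({"x1": -1, "x2": -1, "x3": -1})
print("upper level:", upper)
print("lower level:", lower)
print("coded:", to_coded({"x1": 2000, "x2": 110, "x3": 30}))
```

The upper and lower levels obtained this way (3000/2000 min⁻¹, 110/90 °C, 60/30 min) are the values that enter the operational matrix of the design.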

Construction of the design matrix

Mathematical modeling of the process has to be done according to the problem
definition. FUFE 2³ is therefore used with double replication of design points. The design
matrix with experimental outcomes is shown in Table 2.123.

Table 2.123 Full factorial experiment 2³

Name                 x1     x2    x3
Basic level          2500   100   45
Variation interval   500    10    15
Upper level          3000   110   60
Lower level          2000   90    30

Calculated regression coefficients:
b0 = 2.95×10⁻⁴;  b1 = 0.45×10⁻⁴;  b2 = -1.066×10⁻⁴;  b3 = 0.92×10⁻⁴
y = 2.95·10⁻⁴ + 0.45·10⁻⁴ X1 - 1.066·10⁻⁴ X2 + 0.92·10⁻⁴ X3

Trials  Design matrix     Operational matrix   Response ×10⁴
        X0  X1  X2  X3    x1     x2    x3      yu1    yu2    yu (mean)
1       +   +   +   +     3000   110   60      1.09   0.71   0.90
2       +   -   +   +     2000   110   60      1.34   0.94   1.14
3       +   +   -   +     3000   90    60      3.07   2.65   2.86
4       +   -   -   +     2000   90    60      3.42   3.02   3.22
5       +   +   +   -     3000   110   30      2.90   2.50   2.70
6       +   -   +   -     2000   110   30      3.01   2.59   2.80
7       +   +   -   -     3000   90    30      3.74   3.34   3.54
8       +   -   -   -     2000   90    30      6.64   6.26   6.45

A check of statistical significance must be done for the calculated regression
coefficients, and a check of lack of fit for the regression model. Both checks are a subject
of statistical analysis that will be elaborated in more detail in the next chapter. The
check of the obtained regression model has shown that it is inadequate, so that we
have to reduce the variation intervals of the factors and increase the number of
design-point replications.

Example 2.37 [12]

The effects of temperature, solution concentration and duration of the process on
the crystallization speed of aluminum fluoride (AlF3) have been tested in industrial
conditions of its production. Average crystallization of AlF3 in %/h has been taken
as the system response. FUFE 2³ with three parallel design points in the
experimental center has been used as the design of the experiment. These three parallel design
points are used to estimate the experimental error that is necessary for checking the
significance of regression coefficients and the lack of fit of the obtained regression. The
other design points have not been replicated in this case, as the experiment has been done
in industrial conditions. The design matrix with outcomes of the test and data
processing are shown in Table 2.124.

Table 2.124 Full factorial experiment 2³

Name                 x1    x2   x3
Basic level          90    22   2
Variation interval   10    4    0.5
Upper level          100   26   2.5
Lower level          80    18   1.5

Linear regression:
y = 9.34 + 0.89X1 + 2.15X2 + 1.41X3

Trials  Design matrix     Operational matrix   Response
        X0  X1  X2  X3    x1    x2   x3        yu
1       +   +   +   +     100   26   2.5       9.86
2       +   -   +   +     80    26   2.5       9.09
3       +   +   -   +     100   18   2.5       6.35
4       +   -   -   +     80    18   2.5       6.41
5       +   +   +   -     100   26   1.5       15.00
6       +   -   +   -     80    26   1.5       12.02
7       +   +   -   -     100   18   1.5       9.48
8       +   -   -   -     80    18   1.5       6.52
9       +   0   0   0     90    22   2         9.12
10      +   0   0   0     90    22   2         10.30
11      +   0   0   0     90    22   2         5.80

Example  2.38  [12] 

The process of obtaining alkyl sulfonate in an autoclave with a mixer has been
studied. Basic reagents were: water solution of sodium hydrosulfite 36-38% and
industrial olefin fractions at 240-320 °C. NaNO3 and oxygen from air were used as
initiators of the reaction of free radicals. System factors are: x1 reaction time, h; x2
temperature of reaction, °C; x3 mole ratio of sodium hydrosulfite and olefin; x4 mole
ratio of NaNO3 and olefin; x5 volume ratio of N-propanol and olefin. System
response is the yield of alkyl sulfonate as a per cent of theoretical yield. FRFE design
2⁵⁻² has been used in the experiment with these generating ratios: X4 = X1X2X3;
X5 = -X1X2. Determine the coefficients of the linear regression model.

Determine the aliased/confounded effects in accord with the generating ratios or
defining contrasts:

1  = X1X2X3X4 = -X1X2X5 = -X3X4X5
X1 = X2X3X4 = -X2X5 = -X1X3X4X5
X2 = X1X3X4 = -X1X5 = -X2X3X4X5
X3 = X1X2X4 = -X1X2X3X5 = -X4X5
X4 = X1X2X3 = -X1X2X4X5 = -X3X5
X5 = X1X2X3X4X5 = -X1X2 = -X3X4

Taking into account preliminary information on the insignificance of double,
triple- and higher-order interactions, we obtain these estimates of main effects:

b1 = β1 + β234 - β25 - β1345;  b2 = β2 + β134 - β15 - β2345;
b3 = β3 + β124 - β1235 - β45;  b4 = β4 + β123 - β1245 - β35;
b5 = β5 + β12345 - β12 - β34.

The  design  of  experiments  with  outcomes  is  shown  in  Table  2.125. 

Table 2.125 Fractional factorial experiment 2⁵⁻²

Name                 x1    x2    x3    x4    x5
Basic level          2.0   100   1.5   0.2   2.0
Variation interval   1.0   10    0.5   0.1   1.0
Upper level          3.0   110   2.0   0.3   3.0
Lower level          1.0   90    1.0   0.1   1.0

Regression coefficients:
b0 = 27.21;  b1 = 4.837;  b2 = 2.86;  b3 = 0.81;  b4 = 0.38;  b5 = 11.08

Trials  Design matrix             Operational matrix            Response
        X0  X1  X2  X3  X4  X5    x1    x2    x3    x4    x5    yu
1       +   -   -   -   -   -     1.0   90    1.0   0.1   1.0   14.5
2       +   +   +   -   -   -     3.0   110   1.0   0.1   1.0   18.6
3       +   -   -   +   +   -     1.0   90    2.0   0.3   1.0   13.8
4       +   +   -   +   -   +     3.0   90    2.0   0.1   3.0   51.0
5       +   -   +   +   -   +     1.0   110   2.0   0.1   3.0   23.2
6       +   +   -   -   +   +     3.0   90    1.0   0.3   3.0   41.0
7       +   -   +   -   +   +     1.0   110   1.0   0.3   3.0   38.0
8       +   +   +   +   +   -     3.0   110   2.0   0.3   1.0   17.6

Problem 2.16

Four factors have been varied in testing extraction of hafnium with
tributyl phosphate. The experiment has been done by a 1/2-replica of the
FUFE 2⁴ design. The design matrix with experimental outcomes
is shown in Table 2.126. Determine the linear regression
coefficients.

Table 2.126 Fractional factorial experiment 2⁴⁻¹

Trials  X1  X2  X3  X4   yu
1       -   -   -   -    0.001675
2       +   -   +   -    0.003465
3       -   -   +   +    0.001540
4       -   +   -   +    0.039750
5       +   +   -   -    0.03800
6       +   -   -   +    0.00470
7       -   +   +   -    0.51000
8       +   +   +   +    0.03975

Problem  2.17 

The surface smoothness of the machines in the machine building
industry is of the order of a tenth part of a millimeter. Welded parts
on such a machine must be processed, as the weld may be 2 mm
or more thick. To reduce this thickness to the lowest possible measure,
to save electrodes and to avoid additional machine processing
of the weld, an experiment has been designed to model the influence
of welding parameters on weld thickness. Factors and variation
intervals with the design matrix 2⁵⁻¹ are given in Tables 2.127 and
2.128.


Table 2.127 Factors and variation intervals

Factors                               Variation levels          Δx
                                      -       0       +
X1 electrode wear-out rate, m/h       48      56      64        8
X2 welding rate, m/h                  28      34      40        6
X3 welding step, mm/rev               5       6       7         1
X4 operational voltage, V             20      21      22        1
X5 relative position of electrodes    0.250   0.333   0.416     0.083

Table 2.128 Fractional factorial design 2⁵⁻¹

No.  X0  X1  X2  X3  X4  X5  X1X2  X1X3  X1X4  X1X5  X2X3  X2X4  X2X5  X3X4  X3X5  X4X5  ya    yb
1    +   -   -   -   -   +   +     +     +     -     +     +     -     +     -     -     2.20  0.28
2    +   +   -   -   -   -   -     -     -     -     +     +     +     +     +     +     2.97  0.45
3    +   -   +   -   -   -   -     +     +     +     -     -     -     +     +     +     1.60  0.55
4    +   +   +   -   -   +   +     -     -     +     -     -     +     +     -     -     1.98  0.33
5    +   -   -   +   -   -   +     -     +     +     -     +     +     -     -     +     1.90  0.65
6    +   +   -   +   -   +   -     +     -     +     -     +     -     -     +     -     2.20  0.35
7    +   -   +   +   -   +   -     -     +     -     +     -     +     -     +     -     1.04  0.63
8    +   +   +   +   -   -   +     +     -     -     +     -     -     -     -     +     0.82  1.79
9    +   -   -   -   +   -   +     +     -     +     +     -     +     -     +     -     2.31  0.42
10   +   +   -   -   +   +   -     -     +     +     +     -     -     -     -     +     2.73  0.28
11   +   -   +   -   +   +   -     +     -     -     -     +     +     -     -     +     1.90  0.36
12   +   +   +   -   +   -   +     -     +     -     -     +     -     -     +     -     2.38  0.35
13   +   -   -   +   +   +   +     -     -     -     -     -     -     +     +     +     2.03  0.26
14   +   +   -   +   +   -   -     +     +     -     -     -     +     +     -     -     2.27  0.72
15   +   -   +   +   +   -   -     -     -     +     +     +     -     +     -     -     1.17  0.83
16   +   +   +   +   +   +   +     +     +     +     +     +     +     +     +     +     1.55  0.41

Determine  regression  models  for  both  measured  responses. 


Problem  2.18 

Refinement of waste nitric gases (NO + NO2) is particularly significant
from the point of view of both environmental protection and
the use of the components they contain. A lab plant has been built for
this purpose, where the effects of these factors have been tested:

x1 consumption of ammonium in the form of liquor ammonia,
x2 consumption of lye,
x3 concentration of nitric oxide in gas,
x4 pH,
x5 volume of gas flow,
x6 concentration of salts in circulating lye.

The percentage of absorbed nitric oxides has been measured as the
response.

To obtain the mathematical model of the process, a 1/4-replica of a
full factorial experiment of type 2⁶ has been realized. Design points-trials
have been done in a completely random order. Table 2.129
shows the conditions and outcomes of the 2⁶⁻² fractional factorial
experiment.

Table 2.129 Fractional factorial design 2⁶⁻²

Name                 x1       x2     x3    x4     x5    x6    yu
Basic level          1.25:1   15.4   0.4   8.0    3.0   20
Variation interval   0.22     1.0    0.1   0.65   1.0   10
Lower level          1.03:1   14.4   0.3   7.35   2.0   10
Upper level          1.48:1   16.4   0.5   8.65   4.0   30
1                    -        -      -     -      -     -     45.1
2                    +        +      +     +      +     +     50.7
3                    -        -      +     +      +     +     42.85
4                    +        +      -     -      -     -     43.5
5                    +        +      -     +      -     +     33.65
6                    -        -      +     -      +     -     45.5
7                    +        +      +     -      +     -     56.5
8                    -        -      -     +      -     +     37.55
9                    -        +      -     -      +     +     35.5
10                   +        -      +     +      -     -     58.85
11                   -        +      +     +      -     -     50.47
12                   +        -      -     -      +     +     40.5
13                   +        -      -     +      +     -     56.7
14                   -        +      +     -      -     +     38.1
15                   +        -      +     -      -     +     59.0
16                   -        +      -     +      +     -     58.75

Problem  2.19 

Describe adequately the process of producing seals by means of a
polynomial function. Tensile strength at break, kp/cm², is the
response. Factors of the process are: x1 content of p-quinone
dioxime in the mixture; x2 content of solvent (all in grams per
100 grams of butyl rubber). Fractional factorial design 2⁴⁻¹ has
been used in the experiment. To assert reproducibility of the
experiment, trials have been replicated four times. The design matrix with
experimental outcomes is shown in Table 2.130. Determine the
linear effects and the aliased/confounded effects of two-factor interactions.

Table 2.130 Fractional factorial design 2⁴⁻¹

Name                 x1    x2     x3   x4
Basic level          1.5   4.25   54   95
Variation interval   1.0   0.5    12   5
Upper level          2.5   4.75   66   100
Lower level          0.5   3.75   42   90

Trials  Design matrix         Operational matrix      Response
        X0  X1  X2  X3  X4    x1    x2     x3   x4    yu1   yu2   yu3   yu4   yu (mean)
1       +   -   -   -   -     0.5   3.75   42   90    4.2   3.4   4.0   4.3   3.975
2       +   +   -   +   -     2.5   3.75   66   90    4.7   5.1   5.6   5.3   5.175
3       +   -   -   +   +     0.5   3.75   66   100   4.3   5.2   4.7   5.7   4.975
4       +   -   +   -   +     0.5   4.75   42   100   3.6   3.7   3.9   3.7   3.725
5       +   +   +   -   -     2.5   4.75   42   90    4.5   4.2   4.4   4.6   4.425
6       +   +   -   -   +     2.5   3.75   42   100   4.0   3.6   4.5   4.0   4.050
7       +   -   +   +   -     0.5   4.75   66   90    4.9   4.7   5.1   4.9   4.900
8       +   +   +   +   +     2.5   4.75   66   100   5.0   4.9   5.1   4.9   4.975

Problem  2.20 

To control the complex process of cooking cellulose by the sulfate process
by means of a computer, it is necessary to have a mathematical
model of the process. To obtain this model for the process of
cooking cellulose by the sulfate process from a mixture of soft and
hardwood and deciduous trees, we have used a fractional factorial
experiment. It included these seven factors: x1 consumption of active
lye, % Na2O on completely dry wood; x2 cooking temperature, °C;
x3 hydromodule; x4 cooking time, min; x5 percentage of soft and
hardwood chips, %; x6 percentage of normal chips fraction, %; x7
sulfidity of the lye, %.

The percentage of the coarse chips fraction has been constant at 5%
in all trials. The content of the normal chips fraction has been changed
depending on the design of the experiment. The percentage of the fine
fraction has resulted as the difference between the total chips and the
coarse and normal fractions.

Eleven parameters are analyzed as the system response: y1 cellulose
yield, calculated for completely dry wood, %; y2 final content
of lignin in cellulose, %; y3 degree of delignification of cellulose per
permanganate number; y5 content of noncooked part in cellulose, %;
y6 pressure decrease by blowdown, kp/cm²; y7 resistance to
tearing, g; y8 length of cutting, m; y9 number of double bonds;
y10 lagging activity of black lye, g/l; y11 density of black lye, g/cm³.

To perform the experiment, a 1/8 replica of type 2⁷⁻³ (16 trials)
has been chosen with these defining contrasts:

1 = X1X2X3X4 = X1X2X5X6 = X1X3X5X7 = X1X4X6X7 = X2X3X6X7
  = X2X4X5X7 = X3X4X5X6

All linear effects and the effects of two-factor interactions are
estimated through such a design of experiment:

X1X2 = X3X4 = X5X6;
X1X3 = X2X4 = X5X7;
X1X4 = X2X3 = X6X7;
X1X5 = X2X6 = X3X7;
X1X6 = X2X5 = X4X7;
X1X7 = X3X5 = X4X6;
X2X7 = X4X5 = X3X6;

The associated regression is:

y = b0 + b1X1 + b2X2 + b3X3 + b4X4 + b5X5 + b6X6 + b7X7
  + b12X1X2 + b13X1X3 + b14X1X4 + b15X1X5 + b16X1X6 + b17X1X7
  + b27X2X7 + b127X1X2X7

Conditions of varying all factors with the design matrix are
shown in Table 2.131. Determine the values of the regression
coefficients.

Table 2.131 Fractional factorial design 2⁷⁻³

Design matrix

No.  X1  X2  X3  X4  X5  X6  X7  X1X2  X1X3  X1X4  X1X5  X1X6  X1X7  X2X7  X1X2X7
1    -   -   -   -   -   -   -   +     +     +     +     +     +     +     -
2    +   +   +   +   -   -   -   +     +     +     -     -     -     -     -
3    +   +   -   -   +   +   -   +     -     -     +     +     -     -     -
4    +   -   +   -   +   -   +   -     +     -     +     -     +     -     -
5    +   -   -   +   -   +   +   -     -     +     -     +     +     -     -
6    -   +   +   -   -   +   +   -     -     +     +     -     -     +     -
7    -   +   -   +   +   -   +   -     +     -     -     +     -     +     -
8    -   -   +   +   +   +   -   +     -     -     -     -     +     +     -
9    +   +   -   -   -   -   +   +     -     -     -     -     +     +     +
10   +   -   +   -   -   +   -   -     +     -     -     +     -     +     +
11   +   -   -   +   +   -   -   -     -     +     +     -     -     +     +
12   -   +   +   -   +   -   -   -     -     +     -     +     +     -     +
13   -   +   -   +   -   +   -   -     +     -     +     -     +     -     +
14   -   -   +   +   -   -   +   +     -     -     +     +     -     -     +
15   -   -   -   -   +   +   +   +     +     +     -     -     -     -     +
16   +   +   +   +   +   +   +   +     +     +     +     +     +     +     +
17   0   0   0   0   0   0   0   0     0     0     0     0     0     0     0
18   0   0   0   0   0   0   0   0     0     0     0     0     0     0     0
19   0   0   0   0   0   0   0   0     0     0     0     0     0     0     0
20   0   0   0   0   0   0   0   0     0     0     0     0     0     0     0

Responses

No.  y1     y2      y3     y4      y5     y6     y7    y8      y9     y10    y11
1    64.5   12.05   63.5   134.1   81.0   7.5    100   8.92    0.61   12.4   1.104
2    42.7   2.54    24.8   73.3    0.0    5.3    107   8.07    1.17   13.4   1.078
3    56.5   2.64    25.2   119.2   8.2    6.1    78    8.50    0.42   22.8   1.152
4    67.8   16.98   80.0   135.3   70.0   6.9    121   9.60    0.40   20.0   1.073
5    59.2   3.50    28.6   116.0   0.0    8.1    124   10.31   0.69   24.3   1.146
6    57.5   2.39    24.0   94.3    34.5   9.5    124   9.93    1.01   3.1    1.058
7    53.3   7.87    47.0   145.0   4.9    10.4   110   9.63    3.40   7.8    1.117
8    75.0   19.61   85.0   96.6    70.0   7.9    110   8.39    0.64   18.6   1.051
9    53.2   2.07    23.0   77.8    2.9    6.6    116   9.90    0.83   26.0   1.151
10   69.6   10.33   56.4   126.5   58.0   6.8    122   9.54    0.82   19.6   1.072
11   51.5   6.58    42.0   144.0   0.0    8.2    105   10.18   1.56   35.6   1.154
12   54.0   11.78   62.4   141.5   15.8   8.0    114   9.01    1.28   9.3    1.058
13   51.1   2.72    25.4   99.5    3.1    7.9    96    11.41   1.17   6.2    1.088
14   61.0   7.04    43.8   116.1   70.0   9.3    104   11.84   1.00   8.3    1.054
15   83.5   18.95   84.0   140.6   80.0   8.3    100   9.55    1.01   18.6   1.109
16   45.0   2.75    25.5   80.6    0.0    9.1    117   10.79   1.50   17.3   1.076
17   56.0   4.73    33.6   122.0   2.4    8.6    112   11.39   0.73   14.0   1.092
18   53.1   5.00    34.8   138.0   2.7    9.8    108   10.57   0.97   14.0   1.096
19   56.8   5.14    35.2   92.0    8.5    9.5    103   11.28   1.05   18.6   1.084
20   51.1   5.60    37.2   131.6   0.0    8.2    114   10.97   1.85   14.4   1.092

2.3.1 .2  Optimality  Criterion  for  Experimental  Design 

When designing an experiment, a researcher often does not know in advance where on the response surface the optimum is situated or what the shape of that surface is. The researcher therefore tends to choose, first of all, a design of experiment that guarantees maximum information in the hardest possible situation with a relatively small number of design points. Naturally, in such situations there is a need to assess the optimality of designs by using special criteria. It was first necessary to determine what an optimal design is. George E. P. Box and his school are characterized by an empirical, intuitive approach to choosing criteria for design optimization. First orthogonal and later rotatable designs were suggested as optimal. The sense of these criteria could easily be understood intuitively by those researchers who tended to think out the experimental methodology logically. Interesting theoretical results have been obtained from Box’s works, especially important for linear designs, and originating from the properties of orthogonality and rotatability.

Full factorial experiments and regular fractional replicas are the most efficient designs for obtaining linear models. The good properties of these designs are orthogonality, rotatability, symmetry with reference to the experimental center, and fulfilment of the norming condition. These properties of designs of experiments may be expressed mathematically in this way:


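$$\sum_{u=1}^{N} x_{iu}x_{ju} = 0 \quad \text{for } i \neq j;\ i,j = 0,1,\dots,k; \tag{2.71}$$

$$\sum_{u=1}^{N} x_{iu} = 0 \quad \text{for } i = 1,2,\dots,k; \tag{2.72}$$

$$\sum_{u=1}^{N} x_{iu}^{2} = N \quad \text{for } i = 1,2,\dots,k \tag{2.73}$$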

where: 

k is the number of factors (the number of the last design column);

N is the number of design points-trials (the number of design rows);
u is the index of the current design point-trial.

Expression (2.71) is the condition of orthogonality: the sum of the products of the design signs of any two different columns equals zero. Orthogonality of the design makes the estimates of the regression coefficients independent of one another and simplifies their calculation. This means that factors whose regression coefficients are statistically insignificant can be dropped from the model without recalculating the remaining coefficients. Expression (2.72) is the condition of symmetry of the design with reference to the experimental center. Norming of a design of experiment is defined by relation (2.73). Regression coefficients from designs of experiments that satisfy conditions (2.71) to (2.73) are estimated with minimal variance (each regression coefficient is estimated from all N trials, with a variance that is N times smaller than the error variance of the replicated trials).
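Conditions (2.71) to (2.73) can be checked numerically for any coded design matrix. The sketch below is only an illustration and not part of the original text; the helper names `full_factorial` and `check_design` are hypothetical, and the dummy column x0 = +1 is assumed to come first in every row.

```python
from itertools import product


def full_factorial(k):
    """Rows of a 2^k design in coded units, each preceded by the dummy x0 = +1."""
    return [(1,) + levels for levels in product((-1, 1), repeat=k)]


def check_design(rows):
    """Return (orthogonal, symmetric, normed) for rows of the form (x0, x1, ..., xk)."""
    n_trials = len(rows)
    n_cols = len(rows[0])
    cols = [[row[i] for row in rows] for i in range(n_cols)]
    # (2.71): products of any two different columns sum to zero
    orthogonal = all(
        sum(a * b for a, b in zip(cols[i], cols[j])) == 0
        for i in range(n_cols)
        for j in range(i + 1, n_cols)
    )
    # (2.72): every factor column sums to zero
    symmetric = all(sum(cols[i]) == 0 for i in range(1, n_cols))
    # (2.73): the sum of squares of every factor column equals N
    normed = all(sum(x * x for x in cols[i]) == n_trials for i in range(1, n_cols))
    return orthogonal, symmetric, normed


print(check_design(full_factorial(3)))  # (True, True, True)
```

Removing a single row from the full factorial design is enough to make at least one of the three checks fail, which is a quick way to see why incomplete designs need separate treatment.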

A design of experiment is called rotatable if it is insensitive to rotation of the coordinate axes about the experimental center, that is, if movement from the experimental center is equivalent in any direction. In designs of experiments that satisfy both orthogonality and rotatability, the regression coefficients are not only determined with minimal variances, but these variances are also identical. Only first-order designs can be both orthogonal and rotatable; the optimality problem for second-order designs is much more complicated. Therefore the first designs for second-order models were orthogonal, and later ones rotatable. In recent years, the concept of D-optimality, developed by J. Kiefer, has appeared. In his view, the efficiency of estimates is determined not only by the optimal way of processing experimental results but also by the optimal distribution of the design points-trials in factor space [34]. It has been suggested that these properties be included in an optimality criterion:

• maximal variance value in estimating a response or optimization criterion value;

• volume of the dispersion ellipsoid of parameter estimates.

Designs are called D-optimal if the volume of the dispersion ellipsoid of the parameter estimates is minimal. D-optimal designs correspond to designs that minimize the maximal variance of the response estimate ($\hat{y}$) in the associated space. In practice, it is difficult to find a design that simultaneously satisfies several optimality criteria. It is therefore recommended in each individual case to:

• choose  an  optimality  criterion; 

• choose  the  most  suitable  design  for  the  actual  case. 
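As an illustration of the D-criterion, the determinant of the information matrix $X^TX$ can be compared for two candidate designs. The volume of the dispersion ellipsoid is proportional to $|X^TX|^{-1/2}$, so the design with the larger determinant is preferred. This is a sketch under stated assumptions: a first-order model in two factors, and two hypothetical four-point designs (points at the vertices of the square, and the same points shrunk to ±0.5). The function names are illustrative.

```python
def information_matrix(rows):
    """X^T X for a design given as rows (x0, x1, ..., xk)."""
    p = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]


def det(m):
    """Determinant by cofactor expansion (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


# Hypothetical candidate designs for y = b0 + b1*x1 + b2*x2
vertices = [[1, -1, -1], [1, 1, -1], [1, -1, 1], [1, 1, 1]]
shrunk = [[1, -0.5, -0.5], [1, 0.5, -0.5], [1, -0.5, 0.5], [1, 0.5, 0.5]]

d_vertices = det(information_matrix(vertices))
d_shrunk = det(information_matrix(shrunk))
print(d_vertices, d_shrunk)
```

Under these assumptions the vertex design has the larger determinant, which agrees with the statement that full factorial designs are efficient for linear models.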

A basic requirement in constructing designs of experiments is reduction of the number of experimental trials. Table 2.132 shows the total number of trials (N) for various second-order designs for different numbers of factors.

Table 2.132 Number of trials in second-order designs

Number of   Number N of design points-trials                     Number of
factors     Orthogonal  Rotatable  Hartley  Kiefer  Kono         regression
            design      design     design   design  design       coefficients
2           9           13         7        9       9            6
3           15          20         11       26      21           10
4           25          31         17       72      49           15
5           27*         32*        27*      192     113 (88)     21
6           45          53         29       -       257          28
7           79          92         47       -       577          36

* with half-replica
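The last column of Table 2.132 follows from counting the terms of a full second-order polynomial, $(k+1)(k+2)/2$. The sketch below reproduces that count. It also gives the trial count of a central composite design, assuming (this is not stated in the text) that the orthogonal designs are composite designs with a $2^{k-p}$ core, $2k$ star points and one center point. Both function names are hypothetical.

```python
def n_coefficients(k):
    """Number of coefficients of a full second-order model in k factors:
    1 free member + k linear + k square + k(k-1)/2 interaction terms."""
    return (k + 1) * (k + 2) // 2


def composite_trials(k, p=0, n_center=1):
    """Trials of a central composite design (assumed structure):
    2^(k-p) core points + 2k star points + n_center center points."""
    return 2 ** (k - p) + 2 * k + n_center


for k in range(2, 8):
    print(k, n_coefficients(k))

# Full core for k = 2..4, half-replica core (p = 1) for k = 5
print([composite_trials(k) for k in (2, 3, 4)], composite_trials(5, p=1))
```

Comparing the trial count of a design with `n_coefficients(k)` shows how many degrees of freedom remain for checking the lack of fit.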


It is evident from Table 2.132 that, as far as the number of trials is concerned, Hartley’s designs are the most economical. However, attention should also be paid to orthogonal and rotatable ones.

Hartley’s design with only seven trials is convenient for k=2. Kono’s design may also be good sometimes, and if the number of design points is not limited, we may use Box’s rotatable design (N=13). Good results are also achieved by simplex-summary designs.

A rotatable design is recommended for k=3. The properties of Hartley’s and orthogonal designs are worse, but they may be used when it is necessary to keep the number of design points minimal.

Design B4, which requires only 24 trials, is recommended for k=4. The design is symmetrical and has certain advantages over the D-optimal design. It also makes sense to use Hartley’s design.

Hartley’s design with only 27 trials should first of all be used for k=5. Box’s rotatable design also deserves attention. A comparison of second-order rotatable designs with D-optimal and other designs shows that a rotatable design may be applied where the limits of the experimental region are given by a sphere, i.e. in cases when a researcher is only interested in the response surface in the vicinity of the experimental center. The deficiency of rotatable designs is that the design points do not fall on the cube vertices but on the surface of the sphere inscribed in the cube. This means that the corners of the cube are not used, which greatly influences the accuracy of the obtained model. With an increase in the number of factors, the volume of the unused cube corners increases, so that rotatable designs are not recommended for k>5.

Rotatable  designs  are  most  efficient  for  k=3.  Rotatable  designs  of  second  order  are 
not  orthogonal  and  they  do  not  minimize  the  variance  of  estimates  of  regression 
coefficients.  They  are  efficient  in  solving  research  problems  when  trying  to  find  an 
optimum. 

Rotatable and orthogonal designs are convenient in composite designing, when a second-order design is constructed after an experiment has already been done by a full factorial design or a fractional replica. Hartley’s designs offer a possibility to reduce the number of trials, but they are neither orthogonal nor rotatable. Designs of third order are rarely encountered in practice as they require a large number of design points.

Summary 

The first series of design points-trials of a basic design is preceded by numerous activities meant to select the local domain of the factor space. In the process, the limits of the factor space are estimated; these are determined by limitations in principle, by techno-economic possibilities and by the concrete conditions of running the process. Selecting the domain requires careful analysis of preliminary information on the range of response change and on the curvature of the response surface.

The local domain of an experiment is determined in two stages: determining the basic level and determining the variation interval. The basic (null) level is a multidimensional point in factor space, given by a combination of factor levels. Construction of a design of experiments reduces to the selection of experimental or design points that are symmetrical with reference to the basic level. When defining the basic level it is obligatory to take into account the information on the “best known” point for performing the process, if such information exists.

In the next stage, two variation levels are determined for each factor varied in the experiment. One variation level is called the lower and the other the upper level. The number that gives the upper level when added to the basic level, and the lower level when subtracted from it, is called the variation interval of the associated factor. To simplify the recording of experimental conditions and the processing of experimental results, the axes are scaled so that the value +1 corresponds to the upper level, -1 to the lower level and zero to the basic level.
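The scaling of the axes described above amounts to a linear coding of each factor. A minimal sketch follows, assuming the usual relation x = (X - basic level) / variation interval. The function names and the numerical values (basic level 300, variation interval 10) are illustrative.

```python
def to_coded(value, basic_level, interval):
    """Natural factor value -> coded value (-1 lower, 0 basic, +1 upper)."""
    return (value - basic_level) / interval


def to_natural(coded, basic_level, interval):
    """Coded value -> natural factor value."""
    return basic_level + coded * interval


# Illustrative factor: basic level 300, variation interval 10
for natural in (290, 300, 310):
    print(natural, to_coded(natural, 300, 10))
```

The same two functions can be applied column by column to translate a whole design matrix between coded and natural units.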

Two limitations are imposed on the selection of the factor-variation interval: a lower one (it may not be smaller than the error of fixing a factor) and an upper one (the upper and lower levels cannot lie outside the factor space). When trying to find the optimum in an experimental region, the subregion should be such as to permit moving towards the optimum step by step. When determining a model for interpolation, the variation intervals cover the complete factor space.

When determining the variation interval, we should take into account the accuracy of fixing a factor, the curvature of the response surface and the range of the response change. A researcher is greatly helped in determining a factor-variation interval by twenty-seven situations. As a rule, a low accuracy in fixing a factor requires a wide variation interval. An average variation interval corresponds to an average accuracy of fixing a factor. A high accuracy of fixing a factor leads to a narrow or average variation interval.

An experiment where all possible combinations of factor levels (design points) are realized is called a full factorial experiment. When the number of variation levels of all factors is two, we have a full factorial experiment of type $2^k$. Conditions for performing an experiment are presented in tables (design matrices) where the rows correspond to different design points-trials and the columns to factors. The design of a full factorial experiment $2^2$ has a geometric interpretation as a square whose vertices correspond to the conditions for doing a trial. FUFE $2^3$ has the shape of a cube, and for k>3 we have a hypercube. A type $2^k$ full factorial experiment has the properties of symmetry, norming, orthogonality and rotatability. Regression coefficients calculated from the outcomes of experiments indicate by their values the degree to which the factors affect the system response. The effect of a factor equals double the value of its regression coefficient. We can speak about an interaction between two factors when the effect of one factor depends on the level of the other factor. To determine numerical values of interactions, a column of products of the associated factors is constructed, which is then treated as any other column. Information on the square members cannot be obtained from a full factorial experiment. The column vectors for square members coincide both among themselves and with column $X_0$. The value of the free member $b_0$ includes the effects of the square members, so that it is basically an aliased/confounded estimate. Estimates of the other coefficients are not aliased/confounded. The difference between the number of design points-trials and the number of regression coefficients is great in a full factorial experiment. Therefore a need appeared to construct designs with a smaller number of trials while keeping their optimal properties. Designs where effects of interactions are considered negligible, or where interaction columns are replaced by new factors, are called fractional replicas. The calculated regression coefficients are then, as a rule, estimates of aliased/confounded effects.
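The statements about interaction and square columns can be checked directly on a $2^2$ design. The sketch below is illustrative only and the variable names are hypothetical. It builds the interaction column as a product of factor columns and shows that the square columns coincide with the dummy column $X_0$, which is why square effects are confounded with $b_0$.

```python
from itertools import product

# 2^2 full factorial in coded units; x1 changes fastest
levels = [(x1, x2) for x2, x1 in product((-1, 1), repeat=2)]

x0 = [1 for _ in levels]            # dummy column
x1 = [a for a, _ in levels]
x2 = [b for _, b in levels]
x1x2 = [a * b for a, b in levels]   # interaction column: product of factor columns
x1_sq = [a * a for a in x1]         # square columns
x2_sq = [b * b for b in x2]

print("x1x2 :", x1x2)
print("squares coincide with x0:", x1_sq == x0 and x2_sq == x0)
```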

2.3.1.3 Conclusion after Obtaining Linear Model

By processing the outcomes of a full factorial experiment, we often obtain an adequate linear model in the form of a first-order polynomial. The regression coefficients are the partial derivatives of the response function with respect to the associated factors. As is known, geometrically these are the tangents of the slope angles of the hypersurface to the corresponding axes. A higher absolute value of a regression coefficient corresponds to a larger slope angle, i.e. to a greater response change when the associated factor is changed. This kind of analysis of a regression model is expressed in the abstract language of mathematics. Its translation into the language of a researcher is called model interpretation.

Interpretation of a regression model is complex and is carried out in several stages. In the first stage, the degree of effect of each factor on the response is determined. The value of the regression coefficient is exactly the quantitative measure of this effect: the higher the absolute value, the stronger the effect of the observed factor. The sign of the regression coefficient gives the character of the factor effect. A positive sign indicates that the response value increases with an increase in the factor value, and a negative sign that it decreases. The interpretation of the signs in research problems of finding the optimum depends on whether we are looking for the maximum or the minimum of the response function. When searching for the maximum, an increase in the values of all factors with positive regression coefficients is desirable, while an increase in the values of factors with negative regression coefficients will not contribute to reaching the response maximum. When the minimum of a response function is searched for, the situation is the opposite. For factors whose coefficients are statistically insignificant, one can only say that, for the given variation intervals and reproducibility error of the experiment, they have no important effect on the response.

A change in variation intervals causes changes in regression coefficients. Absolute values of regression coefficients increase with an increase in the variation intervals of the associated factors. An increase in a factor-variation interval does not by itself change the signs of linear regression coefficients. The signs may, however, change if, when moving along the gradient of the response function, we “jump over” the extreme.

In some cases a regression model with factors in real (natural) units is sought. We then switch from the coded regression model to the real one using the transformation expression (2.59). The values of the regression coefficients also change with this transformation, and the possibility of interpreting factor effects from the size and sign of the coefficients is lost. Columns (vectors) of real factor values in the design matrix are no longer orthogonal, so that the regression coefficients are no longer calculated independently either. A regression model with real factors is acceptable only if the research problem is defined as obtaining an interpolation model with real factors.
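The switch from a coded to a natural-unit model can be sketched as follows. It assumes the usual linear coding $x_i = (X_i - X_{i0})/\Delta X_i$; expression (2.59) itself is not reproduced here, so this form is an assumption. The function name and the numbers are illustrative only.

```python
def coded_to_natural(b0, b, basic, interval):
    """Convert y = b0 + sum(b_i * x_i), with x_i = (X_i - basic_i) / interval_i,
    into y = a0 + sum(a_i * X_i). Returns (a0, [a_1, ..., a_k])."""
    a = [bi / d for bi, d in zip(b, interval)]
    a0 = b0 - sum(ai * x0 for ai, x0 in zip(a, basic))
    return a0, a


# Illustrative coded model: y = 88.0 - 2.0*x1 - 4.5*x2
a0, a = coded_to_natural(88.0, [-2.0, -4.5], basic=[1.5, 7.0], interval=[0.5, 1.0])
print(a0, a)
```

Note how the ratio of the two slopes changes after the conversion: each natural-unit coefficient depends on the chosen variation interval, which is why coefficient size no longer measures the strength of a factor.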

This knowledge offers a basis for moving to the next stage of interpretation of the regression model. Prior conclusions give an idea of the character of factor influence on the response. Sources of such conclusions can be a theory of the studied phenomenon, experiments with similar phenomena, previous design points, etc. If we expect an increase in response value with an increase in temperature, and the regression coefficient has a negative sign, a contradiction is present and it must be resolved. Two causes of this contradiction are possible: an error in the experiment, or incorrect prior information, i.e. knowledge about the observed phenomenon. One should remember that the experiment is done in a local domain of factor space, so that the obtained regression coefficients express factor effects exactly in that part of factor space. Extrapolation of those effects to the rest of factor space is problematic. Theoretical knowledge usually has a more general character, so that it may contradict the outcomes from the local domain of factor space. Prior knowledge is often based on single-factor dependencies, so that the situation may change when moving to a multifactor space. This contradiction may be resolved by setting up various hypotheses and checking them experimentally.

In the rare situations when we have sufficient prior theoretical knowledge of the observed phenomenon, we may set up hypotheses on its mechanism, which is the next stage in interpretation of the model. It consists of checking the hypotheses set up on the mechanism of the phenomenon and setting up new ones. In this case special attention is paid to the effects of factor interactions and their interpretation.

Assume that in research we have obtained a statistically significant interaction of two factors with a positive sign. This means that a simultaneous increase or a simultaneous decrease of both factors brings about a response increase (excluding linear effects). If the double interaction has a negative sign, a simultaneous increase or decrease of both factors causes a decrease in response value. One should remember the rule: if a double interaction has a positive sign, an increase in response is obtained by simultaneously increasing or simultaneously decreasing both factor values, and to reduce the response it is necessary to change the two factors in opposite directions. When a double interaction has a negative sign, an increase in response value is obtained by changing the two factors in opposite directions, and to reduce the response it is sufficient to increase or decrease both factors together.
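The rule for double interactions can be verified by evaluating the interaction term $b_{12}x_1x_2$ at the four combinations of coded levels. This is a small illustrative sketch; the coefficient values ±1 are hypothetical and linear effects are left out, as in the text.

```python
def interaction_term(b12, x1, x2):
    """Contribution of the double interaction to the response (coded units)."""
    return b12 * x1 * x2


for b12 in (+1.0, -1.0):  # hypothetical interaction coefficients
    for x1, x2 in [(+1, +1), (-1, -1), (+1, -1), (-1, +1)]:
        change = "raises" if interaction_term(b12, x1, x2) > 0 else "lowers"
        print(f"b12={b12:+.0f}, x1={x1:+d}, x2={x2:+d}: {change} the response")
```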

It is evident that the interpretation of an interaction effect is not as unambiguous as that of linear effects. In any case, two variants are at our disposal, and the question is which should be preferred. One should primarily take into account the linear effects of the associated factors. If the effect of a double interaction has a plus sign and the associated linear effects a minus sign, then negative factor values are chosen ($X_1 = -1$; $X_2 = -1$). Linear effects of different signs are also possible; then the regression coefficient values are compared and the factor with the lower value is sacrificed. Sometimes in such a situation the factor sacrificed is the one whose changes are more difficult to carry out in the experiment or cause higher expense. Interpretation is more complex when the triple interaction $X_1X_2X_3$ is statistically significant. The product has a plus sign if an even number of the factors are at the negative level, and a minus sign if an odd number of them are. Such an interpretation may be generalized to interactions of any order. An approach often used is to consider the product of two factors, conditionally, as a single factor, so that a triple interaction is reduced to a double one.

It has been mentioned that an interpretation of outcomes means a translation from one language into another. Such a translation facilitates understanding between statisticians and researchers who work jointly on system optimization. Interpretation of a regression model is significant not only for understanding the mechanism of the process but also for drawing conclusions about solving the optimization problem.

Conclusions  After  Obtaining  a Mathematical  Model 

Drawing conclusions after processing experimental outcomes may be very complex, depending on the number of factors, on the type of experiment and on the level of the research objective (screening of factors by significance of their effect on response, mathematical modeling, and optimization of the process). The complexity of the obtained models and the number of conclusions that may be drawn from them are extremely high, and we will therefore limit our analysis to typical cases. These differ in the adequacy or lack of fit of the model, in the significance or insignificance of regression coefficients, and in the information on the position of the optimum.

Adequate  linear  model 

Three  cases  are  possible  in  this  situation: 

• all  regression  coefficients  are  statistically  significant; 

• some  regression  coefficients  are  statistically  significant; 

• all  regression  coefficients  are  statistically  insignificant. 

In any of these cases an optimum may be close by, far away, or there may be no information on its position, i.e. the position of the optimum is not defined.

When  the  optimum  region  is  close  by,  three  solutions  are  possible: 

• end  of  research; 

• transfer  to  designs  for  second-order  models; 

• movement  along  the  gradient  of  response  function  towards  optimum. 

Transfer to designs of experiments for second-order models enables a mathematical description of the optimum region and discovery of the extreme. Designs of experiments for second-order models are the subject of the next chapter. Movement along the gradient is applied when the error of the trials is small, as it is hard to establish the response gain when the error is large. When the optimum is undefined or far away, the solution lies in moving along the gradient.

In the second case, only some regression coefficients are statistically significant. Since movement to an optimum along the gradient is most efficient when all regression coefficients are significant, one should choose a solution that makes all regression coefficients statistically significant. One must therefore set up a hypothesis that explains the insignificance of individual coefficients. Insignificance of regression coefficients may be a consequence of a wrong choice of variation intervals, of the inclusion (out of caution) of factors that do not affect the response, of a large error of trials, etc. The first doubt is removed by increasing the variation intervals of the insignificant factors and performing a new sequence of design points. An increase in variation intervals is usually combined with a shift of the experimental center to the point with the best response value. Factors that do not affect the response are fixed and excluded from the next experiment. When the cause is a large error of trials, the solutions for obtaining significant regression coefficients are:

• increase  in  number  of  replicated  trials; 

• upgrade  the  design  of  experiment. 

An increase in the number of replicated trials causes a decrease in the reproducibility variance (experimental error) as well as in the associated variances of regression coefficients. Design points-trials can be replicated in all points of the experiment or in some of them. An upgrade of the design of experiment may be realized by a shift from a fractional to a full factorial experiment, a switch to a bigger replica (from 1/6 to 1/2 replica), a switch to a second-order design (when the optimum region is close by), etc.
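The effect of replication on the variance of regression coefficients can be sketched as follows. The sketch assumes an orthogonal two-level design with N design points, n replicates at every point and a reproducibility variance s2_y of a single trial, so that each coefficient has variance s2_y / (N·n). The function name and the numbers are illustrative only.

```python
def coefficient_variance(s2_y, n_points, n_replicates=1):
    """Variance of a regression coefficient in an orthogonal two-level design
    (assumed form: s2_y / (N * n), with equal replication at every point)."""
    return s2_y / (n_points * n_replicates)


s2_y = 4.0  # hypothetical reproducibility variance of a single trial
for n in (1, 2, 4):
    print(n, coefficient_variance(s2_y, n_points=8, n_replicates=n))
```

Doubling the number of replicates halves the coefficient variance, so the confidence interval of each coefficient narrows by a factor of √2.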


Figure 2.37 Block diagram of search for an optimum for an adequate linear model (branches: all regression coefficients significant; few regression coefficients significant; all regression coefficients insignificant)

Realization of these solutions requires additional experiments, so that it is sometimes unnecessary to follow blindly the rule of moving to the optimum by all the factors; it can be done by only those factors that are statistically significant.

Finally, if an optimum is close by, one should proceed as in the case when all regression coefficients are statistically significant. Let us analyze the last case: an adequate linear model with all regression coefficients statistically insignificant (except $b_0$). This case mostly occurs as a result of either a large experimental error or small factor-variation intervals. Possible solutions are therefore an increase in the accuracy of the experiment and an increase in the factor-variation intervals. An increase in accuracy of the experiment may be achieved in two ways: by improving the methodology of carrying out the experiment and by increasing the number of replicated trials. The conclusions of this analysis are shown in the block diagram for the case of an adequate linear model, Fig. 2.37.

Example  2.39  [35] 

When determining optimal conditions for the technological process of obtaining man-made fibers from polypropylene, these factors have been chosen: $X_1$, temperature of melt, °C; $X_2$, pressure of melt, kp/cm²; $X_3$, speed of winding up on roll, m/min; $X_4$, heating temperature, °C; $X_5$, speed of drawing-out, m/min; and $X_6$, brevity of drawing-out. The optimization parameter has been the fiber tensile strength. Experimental conditions, design matrix and outcomes of trials are given in Table 2.133. In this example, a 1/8 replica of the full factorial experiment $2^6$ has been used with these generating ratios:

$X_4 = X_1X_2X_3$, $X_5 = -X_1X_3$ and $X_6 = -X_2X_3$


Table 2.133 Fractional factorial design $2^{6-3}$

Name                 X1     X2     X3     X4     X5     X6
Basic level          300    50     2.40   150    0.35   7.2
Variation interval   10     7      0.47   5      0.12   0.3
Upper level          310    57     2.87   155    0.47   7.5
Lower level          290    43     1.93   145    0.23   6.9

Trials   Design matrix                      Response
         X1   X2   X3   X4   X5   X6        y
1        -    -    -    -    -    -         53.4
2        +    -    +    -    -    +         65.3
3        +    -    -    +    +    -         54.2
4        -    +    +    -    +    -         56.2
5        -    +    -    +    -    +         52.8
6        +    +    +    +    -    -         52.2
7        +    +    -    -    +    +         65.1
8        -    -    +    +    +    +         52.8

These regression coefficients were obtained after processing the outcomes:

$b_0 = 56.500$; $b_1 = 2.700$; $b_2 = 0.0749$; $b_3 = 0.125$; $b_4 = -3.500$; $b_5 = 0.575$; $b_6 = 2.500$; $S_b = 1.060$
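The coefficients above can be reproduced from the design matrix and responses of Table 2.133. In the sketch below, the columns for $X_4$, $X_5$ and $X_6$ are generated from the levels of $X_1$, $X_2$, $X_3$ as $X_1X_2X_3$, $-X_1X_3$ and $-X_2X_3$, which agrees with the signs tabulated for all eight trials. Each coefficient is then the column-weighted mean of the responses. The variable names are illustrative.

```python
# Responses of the eight trials and the coded levels of X1, X2, X3 (Table 2.133)
y = [53.4, 65.3, 54.2, 56.2, 52.8, 52.2, 65.1, 52.8]
base = [(-1, -1, -1), (+1, -1, +1), (+1, -1, -1), (-1, +1, +1),
        (-1, +1, -1), (+1, +1, +1), (+1, +1, -1), (-1, -1, +1)]

# Columns: X0, X1, X2, X3, then X4, X5, X6 from the generating ratios
rows = [(1, x1, x2, x3, x1 * x2 * x3, -x1 * x3, -x2 * x3) for x1, x2, x3 in base]

n_trials = len(rows)
# Orthogonal design: b_i = sum_u(x_iu * y_u) / N
b = [sum(row[i] * yu for row, yu in zip(rows, y)) / n_trials for i in range(7)]

for i, bi in enumerate(b):
    print(f"b{i} = {bi:.3f}")
```

With these data $b_2$ evaluates to 0.075, against the 0.0749 printed in the text; the other six values agree to the digits given.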

The obtained linear regression equation is adequate. Three out of six regression coefficients are statistically significant ($b_1$, $b_4$ and $b_6$). There is no information on the position of the optimum. Two variants are feasible in this situation:

• to  move  along  the  gradient  towards  optimum; 

• to  increase  variation  intervals  of  insignificant  factors. 

Let us analyze the first variant. Movement to the optimum when only three out of six regression coefficients are significant may be inefficient. Besides, in a 1/8 replica of a full factorial experiment the effects are heavily aliased/confounded and this confounding has not been resolved, so that the significant regression coefficients may in fact be estimates of aliased effects of more significant factors. On the other hand, an increase in factor-variation intervals requires new design points, which are very expensive. In moving along the gradient we risk only two to three trials. Therefore, accepting the variant of movement to the optimum along the gradient seems reasonable.

The second variant, an increase in variation intervals of insignificant factors with additional design points, is acceptable if movement to the optimum proves to be inefficient. A change in variation intervals will require at least eight expensive design points.

Example  2.40  [36] 

In a process of separating rare-earth elements by ion exchange in a solution of amino di carboxylic acid, the percentage content of neodymium in the outlet solution has been used as the response. Only two factors have been analyzed during the research: $X_1$, concentration of inlet solution, %, and $X_2$, pH of inlet solution.

The domain of the factors was $X_1 \in [0.5; 3.0]$ and $X_2 \in [3; 8]$. Previous knowledge indicates that the accuracy of fixing the factors is average, the response surface linear, and the range of response change fairly narrow. Therefore, following the block diagram in Fig. 2.12, wide factor-variation intervals $\Delta x_1 = 0.5$ and $\Delta x_2 = 1.0$ have been chosen, which is 20% of the factor space. Values $x_{10} = 1.5$ and $x_{20} = 7.0$ have been chosen for the basic level. The experiment has been a full factorial one, and the outcomes are shown in Table 2.134.

These regression coefficients are obtained by processing the results:
$b_0 = 88.0$; $b_1 = -2.0$; $b_2 = -4.5$; $b_{12} = 0.5$

The regression equation is adequate and has this form:
$y = 88.0 - 2.0X_1 - 4.5X_2$; $S_b = 0.30$

All  regression  coefficients  are  statistically  significant  and  the  optimum  region  is 
close  by,  so  that  the  best  obtained  response  value  is  y=95%.  The  research  objective  is 
to  obtain  99  to  100%  of  neodymium  with  a minimal  number  of  trials. 
Table 2.134  Full factorial experiment $2^2$

Number                Factors                  Response
of trials   Yates     X0    X1    X2    X1X2   y
1           (1)       +     -     -     +      95
2           a         +     +     -     -      90
3           b         +     -     +     -      85
4           ab        +     +     +     +      82
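
The regression coefficients quoted for Example 2.40 can be checked directly from Table 2.134. The sketch below assumes the usual rule for a two-level orthogonal design, in which each coefficient is the mean of the response multiplied by the sign of the corresponding column; the helper name `effect` is chosen here for illustration.

```python
# Full factorial 2^2 design of Table 2.134 in coded units (-1 / +1).
x1 = [-1, +1, -1, +1]
x2 = [-1, -1, +1, +1]
y = [95.0, 90.0, 85.0, 82.0]


def effect(column, response):
    """Regression coefficient: mean of (column sign * response) over all trials."""
    return sum(c * r for c, r in zip(column, response)) / len(response)


x0 = [1] * len(y)                        # dummy column for the free term
x12 = [a * b for a, b in zip(x1, x2)]    # interaction column X1*X2

coefficients = {
    "b0": effect(x0, y),
    "b1": effect(x1, y),
    "b2": effect(x2, y),
    "b12": effect(x12, y),
}
print(coefficients)
```

The four values agree with those given in the example.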

The solution to this problem may be one of these variants:

• movement along the gradient to optimum;

• end of further research;

• upgrading to a design for a second-order model.

The first variant, movement along the gradient, is the most acceptable, as an
increase in yield of several per cent with two to three additional design points is
very important for the ion-exchange procedure in question. This solution is even
more acceptable when we know that upgrading the existing design towards
the second-order model would require at least five trials. The second variant, which
entails ending any further research, is unacceptable because an increase in yield
of several per cent is economically advantageous. The third variant is
also unacceptable as it requires more trials than the first one.
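
As an illustration of the first variant, the sketch below lays out a few points along the gradient of the model from Example 2.40. It assumes the usual steepest-ascent rule, in which the step of each factor is proportional to the product $b_i \Delta x_i$; the base step of 0.5 pH units and the number of steps are hypothetical choices, not values from the text. Predictions outside the design region are extrapolations and would have to be confirmed by real trials.

```python
# Steepest-ascent sketch for y = 88.0 - 2.0*X1 - 4.5*X2 (coded factors, Example 2.40).
b = {"x1": -2.0, "x2": -4.5}                     # linear coefficients
delta = {"x1": 0.5, "x2": 1.0}                   # variation intervals, natural units
center = {"x1": 1.5, "x2": 7.0}                  # basic level
bounds = {"x1": (0.5, 3.0), "x2": (3.0, 8.0)}    # factor domain

# Products b_i * delta_i define the gradient direction in natural units.
products = {name: b[name] * delta[name] for name in b}
base = max(products, key=lambda name: abs(products[name]))
base_step = 0.5   # hypothetical step for the base factor, chosen by the experimenter

# Scale all steps so that the base factor moves by base_step towards higher response.
scale = base_step / abs(products[base])
steps = {name: products[name] * scale for name in products}

path = []
point = dict(center)
for _ in range(3):   # three additional design points along the gradient
    point = {name: point[name] + steps[name] for name in point}
    inside = all(bounds[n][0] <= point[n] <= bounds[n][1] for n in point)
    coded = {n: (point[n] - center[n]) / delta[n] for n in point}
    predicted = 88.0 + b["x1"] * coded["x1"] + b["x2"] * coded["x2"]
    path.append((round(point["x1"], 3), round(point["x2"], 3), round(predicted, 2), inside))

for row in path:
    print(row)
```

Because both coefficients are negative, both factors are decreased at each step.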

To  finalize  the  analysis  on  drawing  a conclusion  after  obtaining  an  adequate 
model,  it  should  be  pointed  out  that  the  research  problem  of  obtaining  a model  or 
an  interpolation  formula  has  been  fulfilled  by  obtaining  an  adequate  model. 

Inadequate  linear  model 

If a linear model is inadequate, it means that the response surface cannot be
approximated by a plane. Apart from Fisher's criterion, which is there to judge the lack of
fit of a regression model, inadequacy may also be recognized in this way:

• at least one interaction is statistically significant;

• the sum of the regression coefficients of the quadratic terms, $\sum \beta_{ii}$, is
statistically significant. The estimate of that sum is given by the difference between
$b_0$ and the response in the design center, $y_0$. When this difference is greater than
the experimental error, the hypothesis on statistical insignificance of the regression
coefficients of the quadratic terms cannot be accepted. One should remember that the
sum of the regression coefficients of the quadratic terms may be statistically
insignificant although the quadratic effects are significant, because
they may have different signs.
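
The second criterion can be sketched in a few lines. All numbers below are hypothetical and serve only to show the comparison; in particular, the half-width of the experimental error is taken as given, and the way it is obtained (for example from replicated center trials) is an assumption left outside the sketch.

```python
from statistics import mean


def curvature_estimate(b0, center_responses):
    """Estimate of the summed quadratic coefficients: b0 minus the mean center response."""
    return b0 - mean(center_responses)


# Hypothetical values, for illustration only (not taken from the text).
b0 = 88.0
center_responses = [90.9, 91.3, 91.1]
error_half_width = 0.9   # hypothetical confidence half-width of the experimental error

diff = curvature_estimate(b0, center_responses)
curvature_significant = abs(diff) > error_half_width
print(round(diff, 2), curvature_significant)
```

A difference larger in absolute value than the experimental error points to significant quadratic effects; a small difference does not exclude them, since effects of opposite sign may cancel.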

Differentiation  of  significant  and  insignificant  regression  coefficients  is  not 
important  for  inadequate  models.  To  obtain  an  adequate  model,  we  should: 

• change  the  factor  variation  intervals; 

• transfer  the  design  of  the  experiment  center  into  another  point; 

• upgrade  the  design  of  experiments. 
A change in factor-variation interval is the most usual approach, and it requires
additional trials. One may sometimes give up obtaining an adequate model, and the
possibility of movement along the gradient to optimum is then checked at the
expense of additional design points. Strictly speaking, moving to an optimum along the
gradient of an inadequate model is not a correct approach. Movement along the gradient is
therefore preceded by an estimate of the curvature of the response surface (based on the
sum of the regression coefficients of the quadratic terms) and a comparison of the values
of the linear regression coefficients with those of the interaction coefficients. When the
share of quadratic and interaction effects is not great, movement to optimum
along the gradient is possible.

There  is  another  solution  to  movement  to  optimum,  i.e.  when  interaction  effects 
are  included  in  an  inadequate  regression  model  and  this  movement  is  done  by  an 
incomplete  second-order  model.  In  that  case,  the  second-order  model  is  analyzed 
and  the  gradient  direction  changed  from  point  to  point. 

If  the  optimum  region  is  close  by,  the  research  by  this  model  ends  and  we  switch 
to  constructing  the  design  of  experiments  for  the  second-order  model.  Fig.  2.38 
shows  the  block  diagram  of  searching  for  an  optimum  for  an  inadequate  linear 
model. 


Figure 2.38  Block diagram of search for optimum for an inadequate linear model


Example 2.41 [12]

An optimization of the process of obtaining a pharmaceutical product
(carbomethoxysulphonylguanidine) has been done.

The system factors are: $X_1$ - ratio of solvent to basic material, g/l; $X_2$ - temperature
of reaction mixture, °C; and $X_3$ - reaction time, min. Product yield in per cent is the
system response. Experimental conditions and outcomes of trials are given in
Table 2.135.
Table 2.135  Full factorial experiment $2^3$

Name                 X1     X2     X3     Regression coefficients
Basic level          0.7    135    30     b0 = 23.8;    b1 = 1.78;     b2 = 10.23
Variation interval   0.2    5      15     b3 = 9.36;    b12 = 0.17;    b13 = -0.79
Upper level          0.9    140    45     b23 = 3.77;   b123 = 1.00;   Sb = 0.12
Lower level          0.5    130    15

             Factors            Response
Trials       X1     X2     X3   y
1            +      +      +    46.80
2            +      -      +    20.47
3            -      -      +    16.80
4            -      -      -     5.08
5            +      +      -    24.15
6            +      -      -     8.89
7            -      +      -    16.63
8            -      +      +    46.45

By analyzing the values of regression coefficients, one can state that the following are
statistically significant: $b_0$; $b_1$; $b_2$; $b_3$; $b_{13}$; $b_{23}$ and $b_{123}$. A
check of lack of fit shows that a linear model is not adequate. In this situation, a change
in factor-variation interval and a replication of the experiment are acceptable. By
decreasing the variation intervals of $X_2$ and $X_3$, the values of the associated
regression coefficients $b_2$ and $b_3$ will be reduced, as will the interaction
coefficients $b_{13}$, $b_{23}$ and $b_{123}$, to the level at which they become
statistically insignificant. There is a reason to transfer the experimental center to the
conditions of design points No. 1 and No. 8. The suggested change in variation interval
requires replication of the design of experiment, i.e. eight trials.

Movement to optimum by an inadequate linear model is also possible in cases
when doing the mentioned eight trials is not acceptable. The values of the linear
regression coefficients are considerably above those for interactions, the
more so since linear effects are not aliased/confounded with interaction effects.
Although the movement to optimum by an inadequate linear model is mathematically
incorrect, it may be accepted in practice with an adequate risk. Note that when
trying to optimize a process one should aspire towards both the smallest possible
interaction effects and approximately equal or symmetrical linear coefficients. In
problems of interpolation models, the situation is exactly the opposite, since there the
interaction effects, which may be significant, are retained in the model.

Interpolation  models-inadequate  linear  model 

The first thing one must do when searching for an interpolation model is to include
interactions in the model. This is possible when an unsaturated design of
experiments is used. By introducing interactions there may appear a case where the
degrees of freedom are insufficient for a check of lack of fit of the model, and it is
therefore necessary to do two to three trials within the experimental region. All
other approaches to obtaining an interpolation model have to do with the realization of
new trials, by upgrading the basic design of experiment. The same approaches
are used as with the removal of insignificance of regression coefficients: upgrading to
the replica where aliased/confounded effects become clean, upgrading to the full
factorial experiment, and upgrading to the second-order design. Another, less usual,
approach is factor and response transformation. If an adequate model is
not obtained by these approaches, what remains is to split the experimental region into a
number of subregions that will be described by adequate models. This, of course,
requires a reduction of factor variation intervals. Fig. 2.39 shows the block diagram for
obtaining an interpolation model when a linear model is inadequate. When the linear model
is adequate, the research problem is solved.

Figure 2.39  Block diagram for obtaining interpolation model, linear model inadequate

Example 2.42 [36]

An experiment with these factors was done for mathematical modeling of an
extraction process:

$X_1$ - diameter of turbine mixer, mm;
$X_2$ - rotational speed of mixer, min$^{-1}$;
$X_3$ - temperature, °C;
$X_4$ - free acid concentration in water solution, g-eq/l;
$X_5$ - height of liquid layer, mm; and
$X_6$ - ratio of phases in emulsion.

The optimization parameter is the time of complete decomposition in minutes.
Experimental conditions, and the matrix of design of experiments with outcomes of
trials, are shown in Table 2.136. A 1/4-replica of the full factorial experiment $2^6$ was
used. The obtained linear model proved to be inadequate. Therefore, three two-factor
interactions, which are the biggest by their absolute values and are not
aliased/confounded with one another, have been included in the model:

$$\hat{y} = 12.16 + 0.53X_1 + 0.53X_2 - 1.38X_3 - 3.22X_4 + 1.44X_5 - 0.62X_6 - 0.84X_1X_4 - 0.50X_1X_6 - 0.78X_2X_4$$

This regression model is adequate and has been used for designing the
industrial extractor.

Table 2.136  Fractional factorial experiment $2^{6-2}$

Name                 X1     X2     X3     X4     X5     X6
Basic level          90     600    26     0.40   195    0.8115
Variation interval   10     100    4      0.29   25     0.0975
Upper level          100    700    30     0.69   220    0.909
Lower level          80     500    22     0.11   170    0.714

             Factors                                    Response
Trials       X1     X2     X3     X4     X5     X6      y
1            -      +      +      +      -      -        7.00
2            -      -      -      -      +      -       16.50
3            -      -      -      +      -      -        9.50
4            -      -      +      +      +      +        9.00
5            +      +      +      +      +      +        7.75
6            +      -      -      +      +      +       10.75
7            -      +      -      +      +      +       11.50
8            +      -      -      -      -      +       13.25
9            +      +      -      +      -      -        8.50
10           -      +      +      -      +      -       14.00
11           -      -      +      -      -      +        9.25
12           +      -      +      -      +      -       17.25
13           +      +      +      -      -      +       14.50
14           +      +      -      -      +      -       22.00
15           -      +      -      -      -      +       16.25
16           +      -      +      +      -      -        7.50
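
The coefficients of the model in Example 2.42 can be recomputed from Table 2.136. The sketch below assumes the standard estimate for a two-level orthogonal design (mean of sign times response); the row encoding as sign strings and the helper names are illustrative choices.

```python
# Coded design matrix (X1..X6) and response of Table 2.136, one entry per trial.
rows = [
    ("-+++--", 7.00), ("----+-", 16.50), ("---+--", 9.50), ("--++++", 9.00),
    ("++++++", 7.75), ("+--+++", 10.75), ("-+-+++", 11.50), ("+----+", 13.25),
    ("++-+--", 8.50), ("-++-+-", 14.00), ("--+--+", 9.25), ("+-+-+-", 17.25),
    ("+++--+", 14.50), ("++--+-", 22.00), ("-+---+", 16.25), ("+-++--", 7.50),
]
signs = [[1 if ch == "+" else -1 for ch in pattern] for pattern, _ in rows]
y = [response for _, response in rows]
n = len(y)


def column(i):
    """Coded column of factor X_i (1-based index)."""
    return [trial[i - 1] for trial in signs]


def coefficient(col):
    """Mean of (column sign * response) over all trials."""
    return sum(c * r for c, r in zip(col, y)) / n


model = {"b0": sum(y) / n}
for i in range(1, 7):
    model[f"b{i}"] = coefficient(column(i))
for i, j in [(1, 4), (1, 6), (2, 4)]:          # interactions kept in the model
    product = [a * b for a, b in zip(column(i), column(j))]
    model[f"b{i}{j}"] = coefficient(product)

for name, value in model.items():
    print(name, round(value, 3))
```

Rounded to two decimals, the printed values match the coefficients of the regression equation in the example.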

Summary 

Translation of a model from the abstract mathematical language into a researcher's
language is called model interpretation. Interpretation is a complex process with several
phases. It includes estimates of the sizes and signs of the linear regression
coefficients and their interactions, comparison of factor effects, a check of preliminary
information, and in some cases a check of the hypothesis on the mechanism of the process.
The variety of actions and experimental situations is reduced to typical cases, which
differ by adequacy or inadequacy of the model, by significance and insignificance of
regression coefficients, by position of the optimum, etc.

For an adequate linear model with statistically significant regression coefficients,
these decisions are possible: movement to optimum, second-order design, end of
research. If some of the regression coefficients are insignificant, there exist several
activities for transforming them into statistically significant ones: change of
variation interval, shift of design center, rejection of insignificant factors, increasing
the number of parallel trials, upgrading of design. Besides, a possibility exists to move
to the optimum along the gradient, and if the optimum region is close by, then
upgrading the design towards a second-order design or termination of research are
the other options.

Finally, when all regression coefficients are insignificant, either a second-order design
is realized or the research is terminated (optimum region close by), or procedures are
applied for obtaining significant regression coefficients (optimum region far off or
its position undefined). For an inadequate linear model, the research is terminated or a
second-order design is realized if the optimum region is close by. Change of
variation interval, shift of design center, upgrading of design and movement along the
gradient are the activities for any other position of the optimum. Another possibility is
to include interaction effects in the model and then move towards the optimum by
means of an incomplete second-order model.

If the experimental objective is to obtain an interpolation model, an adequate
linear model is the solution. In the case of an inadequate linear model, one of the
following activities is undertaken: inclusion of interaction effects into the model,
upgrading the design, transformation of variables, change of variation intervals.

2.3.2 

Second-order  Rotatable  Design  (Box-Wilson  Design) 

Second-order designs are used in practice when the linear model is
insufficient for a mathematical description of a research subject with adequate
precision. A mathematical model in the form of a second-order polynomial is then
formed. When a response surface is described by a second-order equation, varying a
factor on only two levels does not offer the necessary information. Hence, an
experiment is designed so that factors are varied on three or more levels. One such
design is the second-order rotatable design (Box-Wilson design). These designs are
particularly interesting for k=3 and k=5 in conditions of composite designing
(upgrading of designs of lower order).

With second-order rotatable designs, we upgrade a FUFE design or its fractional
replica (usually a half-replica) to a second-order design by adding a certain number
of "starlike" (axial) points and "null" (center) points to the "core". Use of a FUFE
matrix as the "core" of a second-order rotatable design is recommended for k < 5, and a
half-replica for k ≥ 5. Starlike points are located on the coordinate axes at distance
$\alpha$ from the experimental center, which is chosen from the rotatability condition
(for FUFE as "core") [30]:

$$\alpha = 2^{k/4} \qquad (2.74)$$

If a fractional replica of type $2^{k-p}$ has been chosen as the "core" of the rotatable
design, another expression is used:

$$\alpha = 2^{(k-p)/4} \qquad (2.75)$$
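
Expressions (2.74) and (2.75) are easy to evaluate numerically. In the short sketch below the function name `star_distance` is an illustrative choice, with `p = 0` standing for a full factorial core and `p = 1` for a half-replica.

```python
def star_distance(k, p=0):
    """Distance of the starlike points from the center: alpha = 2**((k - p) / 4)."""
    return 2 ** ((k - p) / 4)


for k, p in [(2, 0), (3, 0), (4, 0), (5, 0), (5, 1), (6, 1)]:
    print(f"k={k}, p={p}: alpha = {star_distance(k, p):.3f}")
```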

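The rotatability condition is defined by these relations:

$$\sum_{u=1}^{N} x_{iu}^2 = N\lambda_2 \quad \text{for } i = 1, 2, \ldots, k \qquad (2.76)$$

$$\sum_{u=1}^{N} x_{iu}^4 = 3\sum_{u=1}^{N} x_{iu}^2 x_{ju}^2 = 3N\lambda_4 \quad \text{for } i, j = 1, 2, \ldots, k;\ i \neq j \qquad (2.77)$$

where $\lambda_2$ and $\lambda_4$ are constants. For k=2 or k=4 these constants are linked
by the ratio:

$$\frac{\lambda_4}{\lambda_2^2} = \frac{kC}{k+2} \qquad (2.78)$$

where C is determined by formula (2.84).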

Conditions (2.76) and (2.77) define independence of the design from rotation of
coordinates. When selecting the number of "null" points (center points, i.e. points in
the experimental center), take into consideration a check of lack of fit of the model, an
estimate of experimental error and the conditions of uniformity [37]. Center points are
created by setting all factors at their midpoints. In coded form, center points fall at
the all-zero level. The center points act as a barometer of the variability in the system.
All the necessary data for constructing the rotatable design matrix for k ≤ 7 are in
Table 2.137. This kind of designing is called central, because all experimental points
are symmetrical with reference to the experimental center. This is shown graphically
for k=2 and k=3 in Fig. 2.40.


Figure 2.40  Distribution of rotatable design points
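
As a numerical illustration of condition (2.77), the sketch below builds central composite designs with a full factorial core and checks that the sum of fourth powers of one coded column equals three times the sum of the mixed squared products. The function name and the tested combinations of k and $n_0$ (taken from Table 2.137) are illustrative choices.

```python
from itertools import product


def rotatable_ccd(k, n0):
    """Core 2^k points, 2k starlike points at alpha = 2**(k/4), and n0 center points."""
    alpha = 2 ** (k / 4)
    core = [list(p) for p in product((-1.0, 1.0), repeat=k)]
    star = []
    for i in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[i] = sign * alpha
            star.append(point)
    center = [[0.0] * k for _ in range(n0)]
    return core + star + center


checks = {}
for k, n0 in [(2, 5), (3, 6), (4, 7)]:
    design = rotatable_ccd(k, n0)
    fourth = sum(p[0] ** 4 for p in design)             # sum of x_1u^4
    mixed = sum((p[0] * p[1]) ** 2 for p in design)     # sum of x_1u^2 * x_2u^2
    checks[k] = (len(design), fourth, 3 * mixed)
    print(k, len(design), round(fourth, 6), round(3 * mixed, 6))
```

The total number of points returned for each k also agrees with the N column of Table 2.137.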


Table 2.137  Construction of rotatable design

Number of    Number of       Number of          Number of       Coded      Total number   Notes on
factors k    "core"          "starlike"         "null"          value α    of trials N    design "core"
             points nj       points nα          points n0
2            4               4                  5               1.414      13             -
3            8               6                  6               1.682      20             -
4            16              8                  7               2.000      31             -
5            32              10                 10              2.378      52             -
5            16              10                 6               2.000      32             Half-replica
6            64              12                 15              2.828      91             -
6            32              12                 9               2.378      53             Half-replica
7            128             14                 21              3.333      163            -
7            64              14                 14              2.828      92             Half-replica

The total number of design points N of a rotatable design is determined from:

$$N = 2^k + 2k + n_0 = n_j + n_\alpha + n_0 \qquad (2.79)$$

Design matrices of central composite rotatable designs (CCRD) for k=2, k=3 and k=5 are
shown in Tables 2.138 to 2.140. By using relation (2.59), which connects coded and
real factor values, we switch from the design matrix to the operational matrix, Table 2.138.

Table 2.138  Central composite rotatable design $2^2 + 2 \times 2 + 5$

Number      Design matrix        Operational matrix    Response   Predicted values
of trials   X1        X2         x1, %     x2, °C      y_u        ŷ_u       (y_u - ŷ_u)^2
1           +         +          2.4       100         50         49.9       0.01
2           -         +          1.4       100         67         65.2       3.24
3           +         -          2.4       40          60         62.4       5.76
4           -         -          1.4       40          70         70.6       0.36
5           -1.414    0          1.2       70          70         68.1       3.61
6           +1.414    0          2.6       70          50         51.5       2.25
7           0         -1.414     1.9       28          72         70.6       1.96
8           0         +1.414     1.9       112         56         57.5       2.25
9           0         0          1.9       70          62         64.0       4.00
10          0         0          1.9       70          64         64.0       0.00
11          0         0          1.9       70          68         64.0      16.00
12          0         0          1.9       70          64         64.0       0.00
13          0         0          1.9       70          62         64.0       4.00

$\sum_{u=1}^{13} y_u = 815$; $\sum_{u=1}^{13} (y_u - \hat{y}_u)^2 = 43.44$
Table 2.139  Central composite rotatable design $2^3 + 2 \times 3 + 6$

Number of trials   X1       X2       X3       y_u
1                  +        +        +        20.67
2                  +        +        -        17.32
3                  +        -        +        16.90
4                  +        -        -        16.72
5                  -        +        +        15.54
6                  -        +        -        15.39
7                  -        -        +        15.22
8                  -        -        -        15.13
9                  -1.68    0        0        15.19
10                 1.68     0        0        17.01
11                 0        -1.68    0        13.96
12                 0        1.68     0        15.76
13                 0        0        -1.68    15.48
14                 0        0        1.68     15.96
15                 0        0        0        15.97
16                 0        0        0        16.00
17                 0        0        0        15.10
18                 0        0        0        14.90
19                 0        0        0        14.78
20                 0        0        0        16.07

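Regression coefficients are calculated after constructing the operational matrix by
these equations:

$$b_0 = \frac{2A}{N}\left[\lambda_4^2 (k+2) \sum_{u=1}^{N} y_u - \lambda_4 C \sum_{i=1}^{k}\sum_{u=1}^{N} x_{iu}^2 y_u\right] \qquad (2.80)$$

$$b_i = \frac{1}{N - n_0} \sum_{u=1}^{N} x_{iu} y_u \qquad (2.81)$$

$$b_{ij} = \frac{C^2}{N\lambda_4} \sum_{u=1}^{N} x_{iu} x_{ju} y_u \qquad (2.82)$$

$$b_{ii} = \frac{AC^2}{N}\left[(k+2)\lambda_4 - k\right] \sum_{u=1}^{N} x_{iu}^2 y_u + \frac{AC^2}{N}(1 - \lambda_4) \sum_{i=1}^{k}\sum_{u=1}^{N} x_{iu}^2 y_u - \frac{2A\lambda_4 C}{N} \sum_{u=1}^{N} y_u \qquad (2.83)$$

where:

$$C = \frac{N}{N - n_0}; \qquad A = \frac{1}{2\lambda_4\left[(k+2)\lambda_4 - k\right]} \qquad (2.84)$$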
Table 2.140  Central composite rotatable design $2^{5-1} + 2 \times 5 + 6$

No. of trials   X1     X2     X3     X4     X5     y_u
1               +      +      +      +      +      31.1
2               -      +      +      +      -      30.8
3               +      -      +      +      -      27.1
4               -      -      +      +      +      29.1
5               +      +      -      +      -      21.6
6               -      +      -      +      +      20.8
7               +      -      -      +      +      24.8
8               -      -      -      +      -      17.8
9               +      +      +      -      -      27.2
10              -      +      +      -      +      29.4
11              +      -      +      -      +      31.1
12              -      -      +      -      -      30.8
13              +      +      -      -      +      23.4
14              -      +      -      -      -      22.2
15              +      -      -      -      -      23.4
16              -      -      -      -      +      22.3
17              -2     0      0      0      0      23.0
18              2      0      0      0      0      31.7
19              0      -2     0      0      0      28.3
20              0      2      0      0      0      26.8
21              0      0      -2     0      0      18.8
22              0      0      2      0      0      35.8
23              0      0      0      -2     0      25.2
24              0      0      0      2      0      27.0
25              0      0      0      0      -2     23.2
26              0      0      0      0      2      28.5
27              0      0      0      0      0      28.3
28              0      0      0      0      0      32.1
29              0      0      0      0      0      28.0
30              0      0      0      0      0      29.4
31              0      0      0      0      0      34.5
32              0      0      0      0      0      35.1

When estimating the significance of regression coefficients, these equations are
used:

$$S_{b_0}^2 = \frac{2A\lambda_4^2 (k+2)}{N} S_y^2 \qquad (2.85)$$

$$S_{b_i}^2 = \frac{S_y^2}{N - n_0} \qquad (2.86)$$

$$S_{b_{ij}}^2 = \frac{C^2}{\lambda_4 N} S_y^2 \qquad (2.87)$$

$$S_{b_{ii}}^2 = \frac{AC^2\left[(k+1)\lambda_4 - (k-1)\right]}{N} S_y^2 \qquad (2.88)$$

The hypothesis on lack of fit of the second-order model is checked by means of Eqs.
(2.172); (2.173); (2.133) and (2.135).

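Calculation of regression coefficients may be transformed into a simpler form:

$$b_0 = a_1 \sum_{u=1}^{N} y_u - a_2 \sum_{i=1}^{k}\sum_{u=1}^{N} x_{iu}^2 y_u \qquad (2.89)$$

$$b_i = a_3 \sum_{u=1}^{N} x_{iu} y_u \qquad (2.90)$$

$$b_{ij} = a_4 \sum_{u=1}^{N} x_{iu} x_{ju} y_u \qquad (2.91)$$

$$b_{ii} = a_5 \sum_{u=1}^{N} x_{iu}^2 y_u + a_6 \sum_{i=1}^{k}\sum_{u=1}^{N} x_{iu}^2 y_u - a_7 \sum_{u=1}^{N} y_u \qquad (2.92)$$

where $a_1$ to $a_7$ are coefficients as determined from Table 2.141.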

Table 2.141  Coefficient values $a_1$ to $a_7$

Number of    Number of    Coefficients
factors k    trials N     a1       a2       a3       a4       a5       a6       a7
2            13           0.2000   0.1000   0.1250   0.2500   0.1250   0.0187   0.1000
3            20           0.1663   0.0568   0.0732   0.1250   0.0625   0.0069   0.0568
4            31           0.1428   0.0357   0.0417   0.0625   0.0312   0.0037   0.0357
5*           32           0.1591   0.0341   0.0417   0.0625   0.0312   0.0028   0.0341
5            52           0.0988   0.0191   0.0231   0.0312   0.0156   0.0015   0.0191
6*           53           0.1108   0.0187   0.0231   0.0312   0.0156   0.0012   0.0187
6            91           0.0625   0.0098   0.0125   0.0156   0.0078   0.0005   0.0098
7*           92           0.0730   0.0098   0.0125   0.0156   0.0078   0.0005   0.0098
7            163          0.0398   0.0052   0.0066   0.0078   0.0039   0.0002   0.0052

* With half-replica
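
The entries of Table 2.141 can be reproduced numerically. The sketch below is not the book's derivation: it assumes that C is taken as N divided by the sum of squares of one coded column (which coincides with $N/(N - n_0)$ for k=2 and k=4) and that $\lambda_4 = C^2 n_j / N$, two assumptions adopted here because they return the tabulated values. The function name `ccrd_coefficients` is illustrative.

```python
def ccrd_coefficients(k, n_core, n0):
    """Return (a1, ..., a7) for a rotatable central composite design.

    k      - number of factors
    n_core - number of "core" points (2**k, or 2**(k-1) for a half-replica)
    n0     - number of center points
    """
    alpha_sq = n_core ** 0.5                 # alpha**2, since alpha = n_core**(1/4)
    n_total = n_core + 2 * k + n0
    sum_sq = n_core + 2 * alpha_sq           # sum of x_iu**2 over all trials
    c = n_total / sum_sq                     # assumption: C = N / sum(x_iu**2)
    lam4 = c * c * n_core / n_total          # assumption: lambda_4 = C**2 * n_core / N
    a = 1.0 / (2 * lam4 * ((k + 2) * lam4 - k))
    a1 = 2 * a * lam4 ** 2 * (k + 2) / n_total
    a2 = 2 * a * lam4 * c / n_total
    a3 = c / n_total
    a4 = c * c / (lam4 * n_total)
    a5 = a * c * c * ((k + 2) * lam4 - k) / n_total
    a6 = a * c * c * (1 - lam4) / n_total
    a7 = a2
    return a1, a2, a3, a4, a5, a6, a7


for label, k, n_core, n0 in [("2", 2, 4, 5), ("3", 3, 8, 6), ("5*", 5, 16, 6)]:
    values = ccrd_coefficients(k, n_core, n0)
    print(label, " ".join(f"{v:.4f}" for v in values))
```

The printed rows agree with the table to within rounding of the fourth decimal.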


Equations (2.89) to (2.92) have these forms for k=3:

$$b_0 = 0.1663 \sum_{u=1}^{20} y_u - 0.0568 \sum_{i=1}^{3}\sum_{u=1}^{20} x_{iu}^2 y_u$$

$$b_i = 0.0732 \sum_{u=1}^{20} x_{iu} y_u$$

$$b_{ij} = 0.1250 \sum_{u=1}^{20} x_{iu} x_{ju} y_u$$

$$b_{ii} = 0.0625 \sum_{u=1}^{20} x_{iu}^2 y_u + 0.0069 \sum_{i=1}^{3}\sum_{u=1}^{20} x_{iu}^2 y_u - 0.0568 \sum_{u=1}^{20} y_u$$

Example 2.43 [30]

The outcome of research in the field of textile fabrics is shown in Table 2.138. The
research included two factors: $X_1$ - reagent concentration, %; $X_2$ - temperature, °C.
Variation intervals are shown in Table 2.142.


Table 2.142  Factor variation intervals

             Variation levels                                  Variation interval
Factors      -1.414    -1       0        +1       +1.414       Δx
X1, %        1.2       1.4      1.9      2.4      2.6          0.5
X2, °C       28        40       70       100      112          30

Factor values in "starlike" points have been determined by relation (2.59):

$$\frac{x_1 - 1.9}{0.5} = -1.414 \Rightarrow x_1 \approx 1.2; \qquad \frac{x_2 - 70}{30} = -1.414 \Rightarrow x_2 \approx 28$$
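
The same conversion can be sketched for all five levels of both factors. The sketch assumes that relation (2.59) has the usual form, coded value = (natural value - basic level) / variation interval, so that the natural value is recovered as basic level + coded value × interval; the function name `natural` is illustrative.

```python
def natural(coded, base, interval):
    """Natural (real) factor value for a given coded level."""
    return base + coded * interval


levels = (-1.414, -1.0, 0.0, 1.0, 1.414)
x1_levels = [round(natural(c, 1.9, 0.5), 1) for c in levels]    # concentration, %
x2_levels = [round(natural(c, 70.0, 30.0)) for c in levels]     # temperature, deg C

print(x1_levels)
print(x2_levels)
```

After rounding, the values coincide with the rows of Table 2.142.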

By processing outcomes, we obtain estimates of regression coefficients for the
second-order regression model:

$$\hat{y}_u = b_0 + b_1X_1 + b_2X_2 + b_{12}X_1X_2 + b_{11}X_1^2 + b_{22}X_2^2 \qquad (2.93)$$

Regression coefficients are calculated by relations (2.89) to (2.92) and by using
Table 2.141.

$$b_0 = 0.2 \sum_{u=1}^{13} y_u - 0.1 \sum_{i=1}^{2}\sum_{u=1}^{13} x_{iu}^2 y_u = 0.2 \times 815 - 0.1(487 + 503) = +64.0$$

$$b_1 = 0.125 \sum_{u=1}^{13} x_{1u} y_u = 0.125(50 - 67 + 60 - 70 - 1.414 \times 70 + 1.414 \times 50) = -5.875$$

$$b_2 = 0.125 \sum_{u=1}^{13} x_{2u} y_u = 0.125(50 + 67 - 60 - 70 - 1.414 \times 72 + 1.414 \times 56) = -4.500$$

$$b_{12} = 0.25 \sum_{u=1}^{13} x_{1u} x_{2u} y_u = 0.25(50 - 67 - 60 + 70) = -1.750$$

$$b_{11} = 0.125 \sum_{u=1}^{13} x_{1u}^2 y_u + 0.0187 \sum_{i=1}^{2}\sum_{u=1}^{13} x_{iu}^2 y_u - 0.100 \sum_{u=1}^{13} y_u$$
$$= 0.125(50 + 67 + 60 + 70 + 2 \times 70 + 2 \times 50) + 0.0187(487 + 503) - 0.100 \times 815 = -2.112$$

$$b_{22} = 0.125 \sum_{u=1}^{13} x_{2u}^2 y_u + 0.0187 \sum_{i=1}^{2}\sum_{u=1}^{13} x_{iu}^2 y_u - 0.100 \sum_{u=1}^{13} y_u$$
$$= 0.125(50 + 67 + 60 + 70 + 2 \times 72 + 2 \times 56) + 0.0187 \times 990 - 0.100 \times 815 = 0.112$$

Hence Eq. (2.93) becomes:

$$\hat{y} = 64.00 - 5.88X_1 - 4.50X_2 - 1.75X_1X_2 - 2.11X_1^2 + 0.11X_2^2 \qquad (2.94)$$


Lack of fit of the obtained regression model, for the case of rotatable designing
with trials replicated only in the design center, is checked by the relation (2.173):

$$S_{AD}^2 = \frac{\sum_{u=1}^{13} (y_u - \hat{y}_u)^2 - \sum_{j=1}^{5} (y_{0j} - \bar{y}_0)^2}{f_{AD}} = \frac{43.44 - 24.00}{13 - 6 - 4} = 6.48$$

$$S_E^2 = \frac{\sum_{j=1}^{5} (y_{0j} - \bar{y}_0)^2}{n_0 - 1} = \frac{24.00}{5 - 1} = 6.00$$

$$F_R = \frac{S_{AD}^2}{S_E^2} = 1.08$$

If degrees of freedom are $f_{AD} = 13 - 6 - 4 = 3$; $f_E = n_0 - 1 = 5 - 1 = 4$ and
$1 - \alpha = 95\%$, $F_T = 6.59$ is obtained from Table E. Since $F_T > F_R$, we may
consider the regression equation (2.94) adequate. The significance of regression
coefficients is checked by the expressions (2.85) to (2.88).
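
The lack-of-fit arithmetic can be retraced from the observed and predicted responses of Table 2.138. In the sketch below the variable names are illustrative, and the tabulated value of F is simply the one quoted in the text, not computed.

```python
# Observed and predicted responses of Table 2.138 (trials 1-13).
observed = [50, 67, 60, 70, 70, 50, 72, 56, 62, 64, 68, 64, 62]
predicted = [49.9, 65.2, 62.4, 70.6, 68.1, 51.5, 70.6, 57.5,
             64.0, 64.0, 64.0, 64.0, 64.0]
n_center = 5          # replicated trials in the design center (trials 9-13)
n_coefficients = 6    # b0, b1, b2, b12, b11, b22

residual_ss = sum((y - yh) ** 2 for y, yh in zip(observed, predicted))
center = observed[-n_center:]
center_mean = sum(center) / n_center
pure_error_ss = sum((y - center_mean) ** 2 for y in center)

f_ad = len(observed) - n_coefficients - (n_center - 1)   # degrees of freedom, lack of fit
f_e = n_center - 1                                       # degrees of freedom, pure error
s2_ad = (residual_ss - pure_error_ss) / f_ad
s2_e = pure_error_ss / f_e
f_ratio = s2_ad / s2_e
f_table = 6.59        # tabulated F for (3, 4) degrees of freedom at 95 %, as quoted

print(round(residual_ss, 2), pure_error_ss, round(s2_ad, 2), s2_e, round(f_ratio, 2))
print("adequate" if f_ratio < f_table else "inadequate")
```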

$$S_{b_0}^2 = 0.20 \times S_y^2 = 0.20 \times 6.0 = 1.20 \Rightarrow S_{b_0} = 1.10$$
$$S_{b_i}^2 = 0.125 \times S_y^2 = 0.125 \times 6.0 = 0.75 \Rightarrow S_{b_i} = 0.87$$
$$S_{b_{ii}}^2 = 0.1438 \times S_y^2 = 0.1438 \times 6.0 = 0.86 \Rightarrow S_{b_{ii}} = 0.93$$
$$S_{b_{ij}}^2 = 0.25 \times S_y^2 = 0.25 \times 6.0 = 1.50 \Rightarrow S_{b_{ij}} = 1.23$$

$$\Delta b_0 = \pm 2 S_{b_0} = \pm 2.20; \quad \Delta b_i = \pm 2 S_{b_i} = \pm 1.74;$$
$$\Delta b_{ii} = \pm 2 S_{b_{ii}} = \pm 1.86; \quad \Delta b_{ij} = \pm 2 S_{b_{ij}} = \pm 2.46.$$

By comparing absolute values of regression coefficients with the calculated intervals,
we may assert with 95% confidence that all regression coefficients except $b_{12}$
and $b_{22}$ are statistically significant. The regression equation becomes:

$$\hat{y} = 64.00 - 5.88X_1 - 4.50X_2 - 2.11X_1^2 \qquad (2.95)$$
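
The significance check of Example 2.43 can be sketched as follows. The variance factors and coefficient values are those quoted in the example; the grouping of coefficients by kind and the variable names are illustrative choices.

```python
from math import sqrt

s2_y = 6.0   # reproducibility variance from the center points

# Factors multiplying s2_y for each kind of coefficient (Example 2.43).
variance_factor = {"b0": 0.20, "bi": 0.125, "bii": 0.1438, "bij": 0.25}

# Coefficient name -> (kind, estimated value).
coefficients = {
    "b0": ("b0", 64.00),
    "b1": ("bi", -5.875),
    "b2": ("bi", -4.500),
    "b12": ("bij", -1.750),
    "b11": ("bii", -2.112),
    "b22": ("bii", 0.112),
}

# Confidence half-width taken as two standard errors, as in the example.
half_width = {kind: 2 * sqrt(f * s2_y) for kind, f in variance_factor.items()}
significant = [name for name, (kind, value) in coefficients.items()
               if abs(value) > half_width[kind]]

print({kind: round(w, 2) for kind, w in half_width.items()})
print(significant)
```

The half-widths differ from the quoted ones only in the last digit, because the example rounds each standard error before doubling it.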


Example  2.44 

This  example  shows  a second-order  rotatable  design  for  three  factors  (k=3).  The 
design  matrix,  in  accord  with  Table  2.139  and  Table  2.137  contains  20  design  points 
in  total,  eight  design  points  in  the  design  core  (nj=8)  and  six  replicate  design  points 
in  the  design  center  (n0=6).  Intervals  and  levels  of  variation  of  the  three  factors  are 
shown  in  Table  2.143. 
Table 2.143  Factor variation intervals

                         Variation levels                                 Variation interval
Factors                  -1.682    -1       0        +1       +1.682      Δx
X1 - temperature, °C     130       140      155      170      180         15
X2 - pressure, kp/cm2    3.2       10       20       30       36.8        10
X3 - time, h             10        30       60       90       110         30

The objective of the research has been the optimization of the adhesion process of a
thermoplastic polymer on pressing at a higher temperature. Tensile strength ($y_u$)
has been analyzed as the optimization criterion. The linear model obtained after the
first eight design points was inadequate. The FUFE design has therefore been upgraded
to a second-order rotatable design. Regression coefficients were determined from
experimental data for the following regression equation:

$$\hat{y}_u = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_{12}X_1X_2 + b_{13}X_1X_3 + b_{23}X_2X_3 + b_{11}X_1^2 + b_{22}X_2^2 + b_{33}X_3^2$$

Regression coefficients are determined from relations (2.89) to (2.92):

$$b_0 = 0.1663 \times 319.01 - 0.0568(221.02 + 217.00 + 221.86) = +15.47$$

$$b_1 = 0.0732 \sum_{u=1}^{20} x_{1u} y_u = 0.0732(20.67 + 17.32 + 16.90 + 16.72 - 15.54 - 15.39 - 15.22 - 15.13 - 1.683 \times 15.19 + 1.683 \times 17.01) = +0.980$$

$$b_2 = 0.0732 \sum_{u=1}^{20} x_{2u} y_u = +0.584$$

$$b_3 = 0.0732 \sum_{u=1}^{20} x_{3u} y_u = +0.326$$

$$b_{12} = 0.125 \sum_{u=1}^{20} x_{1u} x_{2u} y_u = 0.125(20.67 + 17.32 - 16.90 - 16.72 - 15.54 - 15.39 + 15.22 + 15.13) = +0.474$$

$$b_{13} = 0.125 \sum_{u=1}^{20} x_{1u} x_{3u} y_u = +0.411$$

$$b_{23} = 0.125 \sum_{u=1}^{20} x_{2u} x_{3u} y_u = +0.404$$

$$b_{11} = 0.0625 \sum_{u=1}^{20} x_{1u}^2 y_u + 0.0069 \sum_{i=1}^{3}\sum_{u=1}^{20} x_{iu}^2 y_u - 0.0568 \sum_{u=1}^{20} y_u$$
$$= 0.0625(132.89 + 2.83 \times 15.19 + 2.83 \times 17.01) + 0.0069(221.02 + 217.00 + 221.86) - 0.0568 \times 319.01 = +0.248$$

$$b_{22} = 0.0625 \sum_{u=1}^{20} x_{2u}^2 y_u - 13.566 = 0.0625 \times 217.00 - 13.566 = -0.003$$

$$b_{33} = 0.0625 \sum_{u=1}^{20} x_{3u}^2 y_u - 13.566 = 0.0625 \times 221.86 - 13.566 = +0.300$$

The regression equation is as follows:

$$\hat{y} = 15.470 + 0.980X_1 + 0.584X_2 + 0.326X_3 + 0.474X_1X_2 + 0.411X_1X_3 + 0.404X_2X_3 + 0.248X_1^2 - 0.003X_2^2 + 0.300X_3^2 \qquad (2.96)$$

The statistical significance of regression coefficients in Eq. (2.96) is checked by the
reproducibility variance ($S_y^2 = 0.32$). By using expressions (2.155) to (2.158) we get:

$\Delta b_0 = \pm 0.816$;    $S_{b_0} = 0.462$;
$\Delta b_i = \pm 0.542$;    $S_{b_i} = 0.307$;
$\Delta b_{ij} = \pm 0.708$;   $S_{b_{ij}} = 0.401$;
$\Delta b_{ii} = \pm 0.526$;   $S_{b_{ii}} = 0.298$.

By comparing absolute values of regression coefficients with the errors in their
estimates, it becomes evident that all regression coefficients are statistically
significant with 95% confidence, except for $b_{11}$ and $b_{22}$. A check of lack of fit
of the obtained regression model proved that it is adequate with 95% confidence
($F_R < F_T$).


Example  2.45 

The experiment consisted of 32 design points: 16 design points of the half-replica
$2^{5-1}$, ten "starlike" points and six design points replicated in the design center.
The design of experiments matrix with the outcomes of experiments is shown in
Table 2.140. The second-order regression model has this form:

$$\hat{y}_u = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_4X_4 + b_5X_5 + b_{12}X_1X_2 + b_{13}X_1X_3 + b_{23}X_2X_3$$
$$+ b_{14}X_1X_4 + b_{24}X_2X_4 + b_{34}X_3X_4 + b_{15}X_1X_5 + b_{25}X_2X_5 + b_{35}X_3X_5$$
$$+ b_{45}X_4X_5 + b_{11}X_1^2 + b_{22}X_2^2 + b_{33}X_3^2 + b_{44}X_4^2 + b_{55}X_5^2$$

Regression coefficients are calculated thus:

$$b_0 = 0.1591 \sum_{u=1}^{32} y_u - 0.0341 \sum_{i=1}^{5}\sum_{u=1}^{32} x_{iu}^2 y_u = +31.219$$

$$b_1 = 0.0417 \sum_{u=1}^{32} x_{1u} y_u = +0.996;$$

$b_2 = -0.121$; $b_3 = +3.929$; $b_4 = -0.129$; $b_5 = +0.904$;

$$b_{12} = 0.0625 \sum_{u=1}^{32} x_{1u} x_{2u} y_u = -0.394;$$

$b_{13} = -0.856$; $b_{14} = +0.356$; $b_{15} = +0.694$;
$b_{23} = +0.044$; $b_{24} = +0.001$; $b_{25} = -0.331$;
$b_{34} = +0.370$; $b_{35} = -0.094$; $b_{45} = +0.369$;

$$b_{11} = 0.0312 \sum_{u=1}^{32} x_{1u}^2 y_u + 0.0028 \sum_{i=1}^{5}\sum_{u=1}^{32} x_{iu}^2 y_u - 0.0341 \sum_{u=1}^{32} y_u = -0.96;$$

$b_{22} = -0.907$; $b_{33} = -0.969$; $b_{44} = -1.269$; $b_{55} = -1.332$.

Thus, this regression equation has been obtained:

$$\hat{y}_u = 31.22 + 1.00X_1 - 0.12X_2 + 3.93X_3 - 0.13X_4 + 0.90X_5 - 0.96X_1^2$$
$$- 0.91X_2^2 - 0.97X_3^2 - 1.27X_4^2 - 1.33X_5^2 - 0.39X_1X_2 - 0.86X_1X_3$$
$$+ 0.36X_1X_4 + 0.69X_1X_5 + 0.04X_2X_3 + 0.001X_2X_4 - 0.33X_2X_5$$
$$+ 0.37X_3X_4 - 0.09X_3X_5 + 0.37X_4X_5 \qquad (2.97)$$
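
The coefficients of Example 2.45 can be recomputed from Table 2.140 with relations (2.89) to (2.92). The sketch below rebuilds the core from its sign pattern (with $X_5 = X_1X_2X_3X_4$, as the signs in the table imply) and uses unrounded coefficients written as fractions, which round to the k = 5* row of Table 2.141; expressing them as fractions is an assumption made here to avoid rounding error in the small quadratic terms.

```python
from itertools import combinations

k, n0 = 5, 6
y_core = [31.1, 30.8, 27.1, 29.1, 21.6, 20.8, 24.8, 17.8,
          27.2, 29.4, 31.1, 30.8, 23.4, 22.2, 23.4, 22.3]              # trials 1-16
y_star = [23.0, 31.7, 28.3, 26.8, 18.8, 35.8, 25.2, 27.0, 23.2, 28.5]  # trials 17-26
y_center = [28.3, 32.1, 28.0, 29.4, 34.5, 35.1]                        # trials 27-32

# Core 2^(5-1): X1..X4 follow the sign pattern of Table 2.140, X5 = X1*X2*X3*X4.
design = []
for u in range(16):
    x1 = 1 if u % 2 == 0 else -1
    x2 = 1 if (u // 2) % 2 == 0 else -1
    x3 = 1 if (u // 4) % 2 == 0 else -1
    x4 = 1 if u < 8 else -1
    design.append([x1, x2, x3, x4, x1 * x2 * x3 * x4])
for i in range(k):                       # starlike points at -2 and +2 on each axis
    for level in (-2, 2):
        design.append([level if j == i else 0 for j in range(k)])
design += [[0] * k for _ in range(n0)]   # center points
y = y_core + y_star + y_center

# Unrounded a1..a7 for k = 5 with half-replica (assumed fractions, see lead-in).
a1, a2, a3, a4, a5, a6, a7 = 7 / 44, 3 / 88, 1 / 24, 1 / 16, 1 / 32, 1 / 352, 3 / 88

total = sum(y)
sq = [sum(p[i] ** 2 * r for p, r in zip(design, y)) for i in range(k)]

b = {"b0": a1 * total - a2 * sum(sq)}
for i in range(k):
    b[f"b{i + 1}"] = a3 * sum(p[i] * r for p, r in zip(design, y))
    b[f"b{i + 1}{i + 1}"] = a5 * sq[i] + a6 * sum(sq) - a7 * total
for i, j in combinations(range(k), 2):
    b[f"b{i + 1}{j + 1}"] = a4 * sum(p[i] * p[j] * r for p, r in zip(design, y))

for name in sorted(b):
    print(name, round(b[name], 3))
```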

A check of statistical significance of regression coefficients for the reproducibility
variance $S_y^2 = 9.76$ has given these values:

$$S_{b_0}^2 = 0.1591 \times S_y^2 = 0.1591 \times 9.76 = 1.553;$$

$$S_{b_i}^2 = 0.0417 \times 9.76 = 0.409; \quad S_{b_{ij}}^2 = 0.0625 \times 9.76 = 0.610; \quad S_{b_{ii}}^2 = 0.0341 \times 9.76 = 0.323$$

so that:

$$\Delta b_0 = \pm 2 S_{b_0} = \pm 2.50; \quad \Delta b_i = \pm 2 S_{b_i} = \pm 1.28;$$
$$\Delta b_{ij} = \pm 2 S_{b_{ij}} = \pm 1.56; \quad \Delta b_{ii} = \pm 2 S_{b_{ii}} = \pm 1.14.$$

By comparing absolute values of regression coefficients with their interval
estimates, with 95% confidence, these regression coefficients are statistically
significant: $b_0$, $b_3$, $b_{44}$ and $b_{55}$. Following this, the regression model
(2.97) becomes:

$$\hat{y}_u = 31.22 + 3.93X_3 - 1.27X_4^2 - 1.33X_5^2 \qquad (2.98)$$

A check of lack of fit of this model shows that it is adequate with 95% confidence.

Example  2.46  [38] 

We should experimentally establish the effect of the ingredients of extracted phosphorous acid on the degree of decomposition of a phosphorite flotation concentrate (y), and find the maximal degree of decomposition ($y_{max}$). The significant factors for decomposition are: $X_1$, process temperature, °C; $X_2$ to $X_5$, percentage contents of the ingredients MgO, SO3, Al2O3 and F in the phosphorous acid, respectively. Basic levels, variation intervals and limits of the factor space are given in Table 2.144. The factor space corresponds to the range of ingredient concentrations found in industrially extracted phosphorous acid. Extrapolation outside the limits given in Table 2.144 when determining $y_{max}$ therefore makes no sense.
Table 2.144  Factor variation intervals

Factors                  Variation levels           Variation interval
                         -2       0        +2       Δx
X1 - temperature, °C     10       50       90       20
X2 - MgO, %              0.3      2.1      3.9      0.9
X3 - SO3, %              0.0      2.0      4.0      1.0
X4 - Al2O3, %            0.59     1.33     2.07     0.37
X5 - F, %                0.25     0.75     1.25     0.25

To obtain the second-order regression model, a second-order CCRD has been used. The number of design points (trials) for k=5 was 32. The design core corresponded to the half-replica $2^{5-1}$ with the generating ratio $X_5 = X_1X_2X_3X_4$. The value of α and the number of design points in the experimental center, $n_0 = 6$, are taken from Table 2.137. The design of the experiment with outcomes is shown in Table 2.145.
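The structure of this 32-point design can be sketched as follows. It is a minimal illustration, not the book's procedure: the function name `ccrd_half_replica_5` is hypothetical, the star-arm length α = 2 is assumed from the ±2 levels used in this example, and the rows are produced in a different order than in the printed design table.

```python
from itertools import product


def ccrd_half_replica_5(alpha=2.0, n_center=6):
    """Coded design: 2^(5-1) core + 2*5 star points + centre points."""
    core = []
    for x1, x2, x3, x4 in product((1, -1), repeat=4):
        # Generating ratio of the half-replica: X5 = X1*X2*X3*X4.
        core.append([x1, x2, x3, x4, x1 * x2 * x3 * x4])

    star = []
    for i in range(5):
        for sign in (-1, 1):
            point = [0.0] * 5
            point[i] = sign * alpha
            star.append(point)

    center = [[0.0] * 5 for _ in range(n_center)]
    return core + star + center


design = ccrd_half_replica_5()
n_points = len(design)

# Column sums of squares and all pairwise cross-products of the factor columns.
sum_squares = [sum(row[i] ** 2 for row in design) for i in range(5)]
cross_products = [sum(row[i] * row[j] for row in design)
                  for i in range(5) for j in range(i + 1, 5)]

print(n_points, sum_squares, set(cross_products))
```

Each factor column has a sum of squares of 16 + 2·4 = 24 and all cross-products vanish, so the linear coefficients can be estimated independently of one another. The reciprocal 1/24 ≈ 0.0417 is the multiplier that appears in the formula for the linear coefficients of a design of this type.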


Table 2.145  CCRD $2^{5-1}$ + 2 x 5 + 6

No.   X1   X2   X3   X4   X5   yu       No.   X1   X2   X3   X4   X5   yu
1     +    +    +    +    +    34.7     17    -2   0    0    0    0    25.0
2     -    +    +    +    -    40.0     18    +2   0    0    0    0    33.3
3     +    -    +    +    -    39.0     19    0    -2   0    0    0    49.2
4     -    -    +    +    +    39.2     20    0    +2   0    0    0    42.0
5     +    +    -    +    -    26.6     21    0    0    -2   0    0    17.5
6     -    +    -    +    +    29.5     22    0    0    +2   0    0    41.0
7     +    -    -    +    +    30.0     23    0    0    0    -2   0    35.6
8     -    -    -    +    -    34.5     24    0    0    0    +2   0    27.2
9     +    +    +    -    -    32.2     25    0    0    0    0    -2   39.0
10    -    +    +    -    +    41.4     26    0    0    0    0    +2   30.0
11    +    -    +    -    +    33.7     27    0    0    0    0    0    35.4
12    -    -    +    -    -    40.9     28    0    0    0    0    0    36.4
13    +    +    -    -    +    23.9     29    0    0    0    0    0    33.2
14    -    +    -    -    -    33.3     30    0    0    0    0    0    32.4
15    +    -    -    -    -    27.7     31    0    0    0    0    0    37.7
16    -    -    -    -    +    35.9     32    0    0    0    0    0    36.9
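The reproducibility variance quoted in the next paragraph can be checked from the six center-point outcomes (trials 27 to 32 of Table 2.145). The sketch below uses only the Python standard library; the variable names are illustrative.

```python
from statistics import mean, variance

# Outcomes of the six trials in the experimental center (Table 2.145).
center_outcomes = [35.4, 36.4, 33.2, 32.4, 37.7, 36.9]

y_center_mean = mean(center_outcomes)
# Sample variance with n0 - 1 degrees of freedom.
s2_reproducibility = variance(center_outcomes)
degrees_of_freedom = len(center_outcomes) - 1

print(round(y_center_mean, 2), round(s2_reproducibility, 2), degrees_of_freedom)
```

With the outcomes as printed this gives a variance of about 4.47 with 5 degrees of freedom, which is close to the value 4.466 used in the text.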

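Based on the outcomes in the experimental center, the reproducibility variance has been determined: $S_y^2 = 4.466$, with $f = n_0 - 1 = 5$ degrees of freedom. The regression coefficients have these values:

$$b_0 = 34.41; \quad b_1 = 1.08; \quad b_2 = -0.15; \quad b_3 = 4.51; \quad b_4 = -0.54; \quad b_5 = -1.30;$$

$$b_{12} = 0.15; \quad b_{13} = 0.26; \quad b_{14} = 1.61; \quad b_{15} = 0.05; \quad b_{23} = 0.74;$$

$$b_{24} = -0.20; \quad b_{25} = 0.40; \quad b_{34} = 0.40; \quad b_{35} = 0.26; \quad b_{45} = 0.93;$$

$$b_{11} = -1.5; \quad b_{22} = 2.66; \quad b_{33} = -1.47; \quad b_{44} = -0.93;$$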
By rejecting the statistically insignificant regression coefficients, we get this regression model:

$$\hat{y}_u = 34.4 + 4.51X_3 - 1.3X_5 - 1.5X_1^2 + 2.66X_2^2 - 1.47X_3^2 + 1.61X_1X_4$$

A check of lack of fit shows that the obtained model is adequate with 95% confidence. This regression model makes it possible to determine the degree of decomposition of the observed raw material at different temperatures, depending on changes in the contents of ingredients in the acid. To obtain the maximal degree of decomposition $y_{max}$, it is necessary to set factors $X_2$ and $X_5$ to these values: $X_2 = +2$ and $X_5 = -2$, which is evident from the regression model. The effect of the SO3 ($X_3$) content in the phosphorous acid is defined in the regression model by a positive linear and a negative square regression coefficient. The optimal content of this ingredient is determined from the maximum of y with respect to $X_3$, and it is $X_3 = 1.533$ (coded value). With $X_2$, $X_3$ and $X_5$ set to these values, the regression model takes the form:

$$\hat{y} = 52.12 - 1.5X_1^2 + 1.61X_1X_4 \qquad (2.99)$$

To determine the optimal values of temperature ($X_1$) and of the content of ingredient Al2O3 ($X_4$), it is necessary to transfer regression (2.99) into canonical form.
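The optimal SO3 level follows from setting the derivative of the $X_3$ terms of the model, $4.51X_3 - 1.47X_3^2$, to zero. A minimal sketch of that step is given below; the conversion to natural units assumes the center (2.0%) and variation interval (1.0%) listed for SO3 in Table 2.144, and the variable names are illustrative.

```python
# Linear and quadratic coefficients of X3 in the reduced model.
b3, b33 = 4.51, -1.47

# Stationary point of b3*X3 + b33*X3**2; it is a maximum because b33 < 0.
x3_coded_opt = -b3 / (2 * b33)

# Conversion to natural units: x = x0 + X * delta_x (values from Table 2.144).
x3_center, x3_interval = 2.0, 1.0
x3_natural_opt = x3_center + x3_coded_opt * x3_interval

print(round(x3_coded_opt, 3), round(x3_natural_opt, 2))
```

The coded optimum of about 1.53 lies inside the design region from -2 to +2, so no extrapolation beyond the factor space is involved; in natural units it corresponds to roughly 3.5% SO3.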

Example  2.47  [18] 

In Example 2.26, we obtained the linear regression model for the dynamic viscosity y, P, of a composite rocket propellant as a function of mixing speed $X_1$, min^-1, and mixing time $X_2$, min. To determine the conditions of minimal viscosity, the method of steepest ascent was applied. This method defined the local optimum region, which has to be described by a second-order model. The conditions of factor variation are shown in Table 2.146.

Table 2.146  Factor variation intervals

Factors                      Variation levels                                   Δx
                             -1.414    -1.0      0         +1.0      +1.414
X1 - mixing speed, min^-1    31.72     40.00     60.00     80.00     88.28     20
X2 - mixing time, min        11.02     40.00     110.00    180.00    208.98    70

The central composite rotatable design of the experiment, with outcomes, is shown in Table 2.147.

The reproducibility variance from the five replicated design points has the value $S_y^2 = 60.93$, or $S_y = 7.81$.

The  regression  coefficients  are  determined  from  these  relations: 

$$b_0 = 0.2\sum_{u=1}^{13} y_u - 0.1\sum_{i=1}^{2}\sum_{u=1}^{6} X_{iu}^2 y_u = 529.12;$$

$$b_1 = 0.125\sum_{u=1}^{6} X_{1u} y_u = -66.40;$$
Table 2.147  CCRD $2^2$ + 2 x 2 + 5

No.   Design matrix       Working matrix     Response   Predicted   Residual     Squared
      X1        X2        x1       x2        yu         ŷu          yu - ŷu      residual
1     +         +         80       180       499.2      484.7       14.50        210.26
2     -         +         40       180       640.0      637.5       2.50         6.25
3     +         -         80       40        635.2      643.5       -8.30        68.89
4     -         -         40       40        736.0      756.3       -20.30       412.09
5     -1.414    0         31.72    110       688.0      674.4       13.60        184.96
6     +1.414    0         88.28    110       483.2      486.6       -3.40        11.56
7     0         -1.414    60       11.02     800.0      778.6       21.40        457.96
8     0         +1.414    60       208.98    571.2      582.2       -11.00       121.00
9     0         0         60       110       528.0      529.12      -1.12        1.23
10    0         0         60       110       540.8      529.12      11.68        136.66
11    0         0         60       110       520.0      529.12      -9.12        82.99
12    0         0         60       110       531.2      529.12      2.08         4.37
13    0         0         60       110       524.8      529.12      -4.32        18.58
                                                                    Sum          1716.80

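$$b_2 = 0.125\sum_{u=1}^{6} X_{2u} y_u = -69.44;$$

$$b_{12} = 0.25\sum_{u=1}^{6} X_{1u}X_{2u} y_u = -10.00;$$

$$b_{11} = 0.125\sum_{u=1}^{6} X_{1u}^2 y_u + 0.0187\sum_{i=1}^{2}\sum_{u=1}^{6} X_{iu}^2 y_u - 0.1\sum_{u=1}^{13} y_u = 25.70;$$

$$b_{22} = 0.125\sum_{u=1}^{6} X_{2u}^2 y_u + 0.0187\sum_{i=1}^{2}\sum_{u=1}^{6} X_{iu}^2 y_u - 0.1\sum_{u=1}^{13} y_u = 75.67;$$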

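The significance check of regression coefficients:

$$S_{b_0}^2 = 0.2S_y^2 = 12.19; \quad S_{b_0} = 3.49; \quad \Delta b_0 = \pm 2S_{b_0} = \pm 6.98;$$

$$S_{b_i}^2 = 0.125S_y^2 = 7.62; \quad S_{b_i} = 2.76; \quad \Delta b_i = \pm 2S_{b_i} = \pm 5.52;$$

$$S_{b_{ii}}^2 = 0.1438S_y^2 = 8.76; \quad S_{b_{ii}} = 2.96; \quad \Delta b_{ii} = \pm 2S_{b_{ii}} = \pm 5.92;$$

$$S_{b_{ij}}^2 = 0.250S_y^2 = 15.24; \quad S_{b_{ij}} = 3.90; \quad \Delta b_{ij} = \pm 2S_{b_{ij}} = \pm 7.80.$$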

All regression coefficients are significant with 95% confidence. The second-order regression model has the form:

$$\hat{y}_u = 529.12 - 66.40X_1 - 69.44X_2 - 10.00X_1X_2 + 25.70X_1^2 + 75.67X_2^2 \qquad (2.100)$$
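The coefficient relations of this example can be applied directly to the 13 outcomes of the design. The following sketch does that in plain Python; the multipliers 0.2, 0.1, 0.125, 0.25 and 0.0187 are the ones used in the example, the star-arm length 1.414 is taken from the design matrix, and the helper `weighted_sum` is a hypothetical name. Sums run over all 13 trials, which gives the same result as summing only over the trials with non-zero coded values.

```python
ALPHA = 1.414  # star-arm length of the rotatable design for k = 2

# Coded design matrix and measured viscosities, trials 1-13.
x1 = [1, -1, 1, -1, -ALPHA, ALPHA, 0, 0, 0, 0, 0, 0, 0]
x2 = [1, 1, -1, -1, 0, 0, -ALPHA, ALPHA, 0, 0, 0, 0, 0]
y = [499.2, 640.0, 635.2, 736.0, 688.0, 483.2, 800.0, 571.2,
     528.0, 540.8, 520.0, 531.2, 524.8]


def weighted_sum(*columns):
    """Sum over all trials of the element-wise product of the columns."""
    total = 0.0
    for values in zip(*columns):
        term = 1.0
        for value in values:
            term *= value
        total += term
    return total


sum_y = sum(y)
sum_x1sq_y = weighted_sum(x1, x1, y)
sum_x2sq_y = weighted_sum(x2, x2, y)
sum_sq_y = sum_x1sq_y + sum_x2sq_y

b0 = 0.2 * sum_y - 0.1 * sum_sq_y
b1 = 0.125 * weighted_sum(x1, y)
b2 = 0.125 * weighted_sum(x2, y)
b12 = 0.25 * weighted_sum(x1, x2, y)
b11 = 0.125 * sum_x1sq_y + 0.0187 * sum_sq_y - 0.1 * sum_y
b22 = 0.125 * sum_x2sq_y + 0.0187 * sum_sq_y - 0.1 * sum_y

print([round(b, 2) for b in (b0, b1, b2, b12, b11, b22)])
```

The values agree with the coefficients of model (2.100) to within rounding of the multipliers.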
Example 2.48 [39]

To optimize the process of isomerization of sulphanylamide from Problem 2.6, a screening experiment has been done by the random balance method. Factors $X_1$, $X_2$ and $X_3$ have been selected on the basis of this experiment. The process is optimized with respect to these three factors, at fixed values of the other factors. To obtain the second-order model, a central composite rotatable design has been set up. Factor-variation levels are shown in Table 2.148. The design of the experiment and the outcomes of the design points are in Table 2.149.

Table 2.148  Factor variation intervals

Factors   Variation levels
          -1.682    -1      0       1       1.682
X1        0.2       0.3     0.5     0.7     0.8
X2        150       154     160     166     170
X3        30        40      55      70      80

Table 2.149  CCRD $2^3$ + 2 x 3 + 6

No.   Design matrix             Response   No.   Design matrix             Response
      X1       X2       X3      yu               X1       X2       X3      yu
1     +        +        +       76.00      11    0        -1.682   0       71.87
2     +        -        +       74.05      12    0        +1.682   0       77.82
3     -        -        +       80.90      13    0        0        -1.682  72.26
4     -        -        -       73.00      14    0        0        +1.682  79.07
5     +        +        -       76.81      15    0        0        0       77.30
6     +        -        -       62.65      16    0        0        0       72.80
7     -        +        -       81.40      17    0        0        0       77.90
8     -        +        +       82.40      18    0        0        0       78.40
9     -1.682   0        0       84.75      19    0        0        0       77.30
10    +1.682   0        0       72.42      20    0        0        0       77.70

An adequate regression model has been obtained by statistical processing of the outcomes:

$$\hat{y}_u = 76.89 - 2.58X_1 + 2.64X_2 + 2.26X_3 + 0.78X_1X_2 + 0.21X_1X_3 - 2.39X_2X_3 + 0.52X_1^2 - 0.81X_2^2 - 0.51X_3^2 \qquad (2.101)$$


Example  2.49  [16] 

A second-step optimization of the synthesis of methacrylic acid has been done in lab conditions. Optimization has been performed by these factors: $x_1$, temperature, °C; $x_2$, concentration of the starting α-hydroxyisobutyric acid in water solution, %; and $x_3$, volume flow of the solution, l/h. Two optimization parameters have been analyzed: $y_1$, methacrylic acid yield relative to the α-hydroxyisobutyric acid fed in, and $y_2$, methacrylic acid yield relative to the dissolved α-hydroxyisobutyric acid (both responses are in per cent of the theoretical yield). The basic experiment has been done according to a FUFE design, as shown in Table 2.150.

Table 2.150  FUFE $2^3$

Factors   x1     x2     x3     y1      y2
+         270    30     0.6    -       -
0         255    20     0.5    82.1    85.4
-         240    10     0.4    -       -

No.       X1     X2     X3     y1      y2
1         -      -      -      53.0    61.5
2         +      -      -      65.3    69.0
3         -      +      -      76.1    87.6
4         +      +      -      77.0    84.5
5         -      -      +      72.7    72.7
6         +      -      +      56.1    70.5
7         -      +      +      81.0    82.7
8         +      +      +      74.0    87.0

Table 2.151  Additional design points for CCRD $2^3$ + 2 x 3 + 6

No.   X1       X2       X3       y1      y2
1     -1.682   0        0        70.0    78.1
2     +1.682   0        0        72.1    79.3
3     0        -1.682   0        49.1    49.2
4     0        +1.682   0        74.8    81.4
5     0        0        -1.682   79.7    81.2
6     0        0        +1.682   83.5    90.6
7     0        0        0        82.0    86.8
8     0        0        0        82.9    86.2
9     0        0        0        83.6    84.9
10    0        0        0        82.6    85.9
11    0        0        0        83.1    86.6

Statistical data analysis has given these regression coefficient values:

for $y_1$: $b_0 = 69.41$; $b_1 = -1.26$; $b_2 = 7.62$; $b_3 = 1.56$; $b_{12} = -0.21$; $b_{13} = -4.70$; $b_{23} = -0.94$; $b_{123} = 2.60$;

for $y_2$: $b_0 = 76.92$; $b_1 = 0.81$; $b_2 = 8.50$; $b_3 = 1.28$; $b_{12} = -0.51$; $b_{13} = -0.28$; $b_{23} = -1.88$; $b_{123} = 2.13$.
To calculate the reproducibility variance, all design points have been replicated once. The design points of the FUFE design were run in completely random order. A check of the significance of the regression coefficients has shown that, apart from the linear ones, interaction regression coefficients are also significant. The differences between the response in the design center and the corresponding free terms ($b_0$), $\Delta y_1 = 82.1 - 69.4 = 12.7$ and $\Delta y_2 = 85.4 - 76.9 = 8.5$, indicate that the FUFE has been set in a part of the factor space with a high curvature of the response surface. The FUFE has therefore been upgraded to a second-order rotatable design with additional design points, as shown in Table 2.151.

One design point (the 6th) in the design center is missing from the additional design points. This design point is in the FUFE design matrix.
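The curvature check described above can be sketched for the first response. The code below estimates a few coefficients of the $2^3$ full factorial from the eight $y_1$ outcomes and compares the free term with the center-point response; the function name `coefficient` is hypothetical, and the data are the tabulated values, so small differences from the printed coefficients are to be expected if those were computed from unrounded replicate means.

```python
# Coded 2^3 design (X1, X2, X3) and y1 outcomes, trials 1-8.
design = [(-1, -1, -1), (1, -1, -1), (-1, 1, -1), (1, 1, -1),
          (-1, -1, 1), (1, -1, 1), (-1, 1, 1), (1, 1, 1)]
y1 = [53.0, 65.3, 76.1, 77.0, 72.7, 56.1, 81.0, 74.0]
y1_center = 82.1  # response measured at the design center


def coefficient(signs, response):
    """Regression coefficient of an orthogonal two-level column."""
    return sum(s * r for s, r in zip(signs, response)) / len(response)


b0 = coefficient([1] * len(y1), y1)
b1 = coefficient([row[0] for row in design], y1)
b2 = coefficient([row[1] for row in design], y1)
b3 = coefficient([row[2] for row in design], y1)
b13 = coefficient([row[0] * row[2] for row in design], y1)

# Difference between the center response and the free term of the linear model.
curvature = y1_center - b0

print(round(b0, 2), round(b1, 2), round(b2, 2), round(b3, 2),
      round(b13, 2), round(curvature, 1))
```

The difference of 12.7 is large compared with the linear effects themselves, which is the reason given in the text for extending the design to second order.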

These regression models were obtained by statistical analysis:

$$\hat{y}_1 = 83.70 - 0.49X_1 + 7.63X_2 + 1.38X_3 - 0.21X_1X_2 - 4.71X_1X_3 - 0.94X_2X_3 - 4.41X_1^2 - 7.60X_2^2 - 0.50X_3^2 \qquad (2.102)$$

$$\hat{y}_2 = 87.20 + 0.64X_1 + 8.25X_2 + 1.89X_3 - 0.51X_1X_2 - 0.28X_1X_3 - 1.88X_2X_3 - 2.40X_1^2 - 7.42X_2^2 - 0.24X_3^2 \qquad (2.103)$$


Example  2.50  [16] 

A process whose properties depend on four factors has been tested. A full factorial experiment and optimization by the method of steepest ascent have brought the experiment into a part of the factor space where only two factors are significant and where an inadequate linear model has been obtained. To analyze the given factor space in detail, a central composite rotatable design has been set up, as shown in Table 2.152.
Table 2.152  CCRD $2^2$ + 2 x 2 + 5

Variation levels    Factors
                    x1       x2
Null level          9.20     4.89
Upper level         10.00    6.89
Lower level         8.40     2.89
+1.41               10.33    7.71
-1.41               8.07     2.07

Regression coefficients: $b_0 = 85.14$; $b_1 = 3.43$; $b_2 = -1.32$; $b_{12} = 3.00$; $b_{11} = 2.60$; $b_{22} = -1.19$

$$\hat{y}_u = 85.14 + 3.43X_1 - 1.32X_2 + 3.00X_1X_2 + 2.60X_1^2 - 1.19X_2^2$$

No.   Design matrix                                     Response
      X0    X1       X2       X1^2    X2^2    X1X2      yu      ŷu       (yu - ŷu)^2
1     +     -        -        +       +       +         87.1    87.44    0.1156
2     +     -        +        +       +       -         79.0    78.80    0.0400
3     +     +        -        +       +       -         88.9    88.30    0.3600
4     +     +        +        +       +       +         92.8    91.66    1.2996
5     +     -1.41    0        2.0     0       0         85.6    85.50    0.0100
6     +     +1.41    0        2.0     0       0         94.0    95.18    1.3924
7     +     0        -1.41    0       2.0     0         84.5    84.62    0.0144
8     +     0        +1.41    0       2.0     0         80.0    80.90    0.8100
9     +     0        0        0       0       0         83.7    85.14    2.0736
10    +     0        0        0       0       0         86.0    85.14    0.7396
11    +     0        0        0       0       0         85.8    85.14    0.4356
12    +     0        0        0       0       0         83.9    85.14    1.5376
13    +     0        0        0       0       0         86.3    85.14    1.3456
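The predicted values in the table can be reproduced by evaluating the fitted model on the columns of the design matrix. The sketch below is only an illustration: the function name `predict` is hypothetical, and the squared star-point columns are entered as 2.0, as printed in the design matrix, rather than as 1.41 squared.

```python
# Coefficients of the second-order model of this example.
b = {"b0": 85.14, "b1": 3.43, "b2": -1.32, "b12": 3.00, "b11": 2.60, "b22": -1.19}

# Design-matrix columns (X1, X2, X1^2, X2^2, X1X2) for trials 1-8 and the center.
rows = [
    (-1, -1, 1, 1, 1),
    (-1, 1, 1, 1, -1),
    (1, -1, 1, 1, -1),
    (1, 1, 1, 1, 1),
    (-1.41, 0, 2.0, 0, 0),
    (1.41, 0, 2.0, 0, 0),
    (0, -1.41, 0, 2.0, 0),
    (0, 1.41, 0, 2.0, 0),
    (0, 0, 0, 0, 0),
]


def predict(x1, x2, x1_sq, x2_sq, x1x2):
    """Evaluate the quadratic model on one row of the design matrix."""
    return (b["b0"] + b["b1"] * x1 + b["b2"] * x2 + b["b12"] * x1x2
            + b["b11"] * x1_sq + b["b22"] * x2_sq)


predicted = [round(predict(*row), 2) for row in rows]
print(predicted)
```

Subtracting these predictions from the measured responses and squaring gives the last column of the table, whose sum is the residual sum of squares used in the lack-of-fit check.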

Problem  2.21  [40] 

When researching the process of zirconium extraction by tributylphosphate from nitric acid solution, two factors, $X_1$ and $X_2$, have been analyzed. The observed factors and the response are defined by these relations:

$$X_1 = \log_2 X_H - 1.5; \quad X_2 = 2(\log_2 T + 2.5); \quad y = \log_2 D$$

where:

$X_H$ is the equilibrium concentration of hydrogen ion;

T is the concentration of free tributylphosphate;

D is the coefficient of separation.

Previous research has indicated that, for mathematical modeling of the process, it is necessary to use a second-order rotatable design. The design matrix with experimental outcomes is given in Table 2.153.
Table 2.153  CCRD $2^2$ + 2 x 2 + 4

No.   X0    X1       X2       X1^2    X2^2    X1X2    yu1         yu2         ȳu
1     +     -        -        +       +       +       -7.94520    -7.95220    -7.94870
2     +     -        +        +       +       -       -6.25480    -6.56050    -6.40770
3     +     +        -        +       +       -       -0.29170    -0.21430    -0.25300
4     +     +        +        +       +       +       +1.36740    +1.21430    +1.29085
5     +     -1.41    0        +2      0       0       -7.18830    -7.76480    -7.47665
6     +     +1.41    0        +2      0       0       +2.05320    +2.05680    +2.05500
7     +     0        -1.41    0       +2      0       -5.28840    -5.23290    -5.26065
8     +     0        +1.41    0       +2      0       -3.05910    -3.03520    -3.07715
9     +     0        0        0       0       0       -4.33690    -4.19000    -4.26345
10    +     0        0        0       0       0       -4.18740    -4.28800    -4.23770
11    +     0        0        0       0       0       -4.19530    -4.19270    -4.19400
12    +     0        0        0       0       0       -4.19530    -4.19270    -4.19400

Problem  2.22  [41] 

The research subject in the given problem is the process of cementation, based on displacing mercury from a hydrochloric acid solution by means of a less noble metal, such as aluminum. A study of the kinetics of the given chemical reaction shows that this process may be effectively conducted in a continuous chemical reactor. Process efficiency is measured by the mercury concentration in the solution after refinement. This is simultaneously the system response, as it may be measured quite accurately and quantitatively. These three factors influence the cementation process significantly: $X_1$, temperature of solution, °C; $X_2$, solution flow rate in the reactor, ml/1; and $X_3$, quantity of aluminum, g. The factor space is defined by these intervals: 50<$X_1$<100; 300<$X_2$<3000; 6<$X_3$<16.

These values have been chosen for the design center and variation intervals:

$x_{10} = 80$; $\Delta x_1 = 10$; $x_{20} = 750$; $\Delta x_2 = 300$; $x_{30} = 12.66$; $\Delta x_3 = 2$.

Values for all variation levels are shown in Table 2.154. Select FUFE $2^3$ as the basic design of the experiment. Determine the linear regression model from the experimental outcomes, Table 2.155. Assume that the obtained linear model is inadequate and that there is curvature of the response surface. To check these assumptions, additional design points were done in the experimental center, so that their average is $\bar{y}_0 = 0.1097$ ($\bar{y}_0$ is an estimate of the free term in the linear regression, i.e. $\bar{y}_0 \to \beta_0$). Since the difference $b_0 - \bar{y}_0$ is the measure of the response surface curvature, check the value of this difference. To model the curved response surface, upgrade the FUFE to a CCRD and determine the regression coefficients of the second-order model. The design point outcomes are in Table 2.155.
Table 2.154  Factor-variation intervals

Factors                          Variation levels                                  Δx
                                 -1.682    -        0        +        +1.682
X1 - temperature of solution     63.18     70       80       90       96.82     10
X2 - flow of solution            245.4     450      750      1050     1254.6    300
X3 - quantity of aluminum        9.3       10.66    12.66    14.66    16.02     2
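The natural factor levels follow from the design center and the variation interval through x = x0 + X·Δx, where X is the coded level. A short sketch of that conversion for the three factors of this problem is given below; the function name `natural_level` and the dictionary layout are illustrative choices.

```python
def natural_level(center, interval, coded):
    """Convert a coded level X to natural units: x = x0 + X * delta_x."""
    return center + coded * interval


# (design center x0, variation interval delta_x) for each factor.
factors = {"X1": (80, 10), "X2": (750, 300), "X3": (12.66, 2)}
coded_levels = (-1.682, -1, 0, 1, 1.682)

levels = {
    name: [round(natural_level(center, interval, coded), 2)
           for coded in coded_levels]
    for name, (center, interval) in factors.items()
}

for name, values in levels.items():
    print(name, values)
```

For the star points this gives 63.18 and 96.82 °C for the temperature, 245.4 and 1254.6 for the flow, and 9.3 and 16.02 g for the quantity of aluminum.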

Table 2.155  CCRD $2^3$ + 2 x 3 + 6

No.   X0    Design matrix                 Response
            X1        X2        X3        yu
1     +     +         +         +         0.1082
2     +     -         +         +         0.2940
3     +     +         -         +         0.0956
4     +     -         -         +         0.1034
5     +     +         +         -         0.3855
6     +     -         +         -         0.7045
7     +     +         -         -         0.2761
8     +     -         -         -         0.4271
9     +     +1.682    0         0         0.0783
10    +     -1.682    0         0         0.3464
11    +     0         +1.682    0         0.3321
12    +     0         -1.682    0         0.0714
13    +     0         0         +1.682    0.2094
14    +     0         0         -1.682    0.7048
15    +     0         0         0         0.1224
16    +     0         0         0         0.1382
17    +     0         0         0         0.1204
18    +     0         0         0         0.0943
19    +     0         0         0         0.0698
20    +     0         0         0         0.1135
Problem 2.23 [42]

In the study of a sort of textile fabric, its permeability to air has been tested. The research objective was to obtain a mathematical model for air permeability:

$$y = f(X_1, X_2, X_3)$$

where:

y, l/m²s, is air permeability through the textile fabric;

$X_1$, g/m², is textile fabric homogeneity;

$X_2$, g/cm³, is volume weight of the material;

$X_3$, mm W.C., is air pressure drop.

The experiment has been done by a CCRD design. The conditions of factor variation are given in Table 2.156, and the experimental outcomes in Table 2.157. Determine the regression coefficients in the second-order model.


Table 2.156  Factor variation intervals

Factors               Variation levels                                Δx
                      -1.68    -1.00    0       +1.00    +1.68
X1 - permeability     92       120      160     200      227      40
X2 - homogeneity      0.10     0.18     0.30    0.42     0.50     0.12
X3 - pressure drop    2        4        7       10       12       3

Table 2.157  CCRD $2^3$ + 2 x 3 + 6

No.   Design matrix           y        No.   Design matrix           y
      X1      X2      X3                     X1      X2      X3
1     +       +       +       151      11    0       -1.68   0       844
2     +       +       -       70       12    0       +1.68   0       180
3     +       -       +       626      13    0       0       -1.68   140
4     +       -       -       330      14    0       0       +1.68   576
5     -       +       +       507      15    0       0       0       369
6     -       +       -       250      16    0       0       0       352
7     -       -       +       1000     17    0       0       0       354
8     -       -       -       540      18    0       0       0       335
9     -1.68   0       0       740      19    0       0       0       357
10    +1.68   0       0       227      20    0       0       0
Problem 2.24 [43]

The problem in this study consisted of obtaining experimental results on the kinetics of the drying process of granulated silicon dioxide for mathematical modeling of the process. The drying of solid capillary-porous materials is theoretically the most studied drying process, but only under the condition that the heat transfer coefficients are constant. Data from the literature on drying granules from 80% to 5% moisture, however, indicate considerable changes in the transfer coefficients.

Taking the changes in heat transfer coefficients into consideration gives nonlinear models whose solutions are either complex or impossible. Because of these difficulties in analyzing simultaneous heat and mass transfer, mathematical modeling of the process was done by applying experimental and statistical methods. The drying of moist SiO2 granules in a fluidized bed was tested on lab equipment. These system factors were analyzed: $X_1$, diameter of granules, mm; $X_2$, length of granules, mm; $X_3$, air flow velocity, m/s; $X_4$, moisture of granules, %; and $X_5$, air temperature, °C. The design matrix with the outcomes of design points is shown in Table 2.158. Experimental results have been obtained in this way: SiO2 granules of the corresponding moisture were fed into a fluidized column, and then hot air was passed through the formed granule bed at a certain velocity. Granule samples were taken periodically during the drying process to determine moisture. From the results of such measurements, moisture-time diagrams were constructed. The current moisture and the drying speed at the analyzed point were determined from the diagram for each design point. Drying speed was taken as the response of the experiment. Determine the regression coefficients in the second-order mathematical model.


Table 2.158  CCRD $2^5$ + 2 x 5 + 10

No.      Design matrix                       Operational matrix                     Drying speed
trials   X1     X2     X3     X4     X5      x1      x2      x3      x4      x5     yu
1        +      +      +      +      +       5       8       3       55      289    0.390
2        -      +      +      +      +       3       8       3       55      289    0.400
3        +      -      +      +      +       5       4       3       55      289    0.358
4        -      -      +      +      +       3       4       3       55      289    0.503
5        +      +      -      +      +       5       8       2.5     55      289    0.075
6        -      +      -      +      +       3       8       2.5     55      289    0.258
7        +      -      -      +      +       5       4       2.5     55      289    0.308
8        -      -      -      +      +       3       4       2.5     55      289    0.260
9        +      +      +      -      +       5       8       3       35      289    0.508
10       -      +      +      -      +       3       8       3       35      289    0.840
11       +      -      +      -      +       5       4       3       35      289    0.580
12       -      -      +      -      +       3       4       3       35      289    0.790
13       +      +      -      -      +       5       8       2.5     35      289    0.458
14       -      +      -      -      +       3       8       2.5     35      289    0.600
15       +      -      -      -      +       5       4       2.5     35      289    0.450
16       -      -      -      -      +       3       4       2.5     35      289    0.500
17       +      +      +      +      -       5       8       3       55      189    0.240
18       -      +      +      +      -       3       8       3       55      189    0.266
19       +      -      +      +      -       5       4       3       55      189    0.233
20       -      -      +      +      -       3       4       3       55      189    0.265
21       +      +      -      +      -       5       8       2.5     55      189    0.540
22       -      +      -      +      -       3       8       2.5     55      189    0.208
23       +      -      -      +      -       5       4       2.5     55      189    0.158
24       -      -      -      +      -       3       4       2.5     55      189    0.202
25       +      +      +      -      -       5       8       3       35      189    0.402
26       -      +      +      -      -       3       8       3       35      189    0.400
27       +      -      +      -      -       5       4       3       35      189    0.325
28       -      -      +      -      -       3       4       3       35      189    0.475
29       +      +      -      -      -       5       8       2.5     35      189    0.500
30       -      +      -      -      -       3       8       2.5     35      189    0.458
31       +      -      -      -      -       5       4       2.5     35      189    0.533
32       -      -      -      -      -       3       4       2.5     35      189    0.400
33       +2.3   0      0      0      0       6.378   6       2.75    45      239    0.400
34       -2.3   0      0      0      0       1.622   6       2.75    45      239    0.600
35       0      +2.3   0      0      0       4       10.76   2.75    45      239    0.386
36       0      -2.3   0      0      0       4       1.24    2.75    45      239    0.475
37       0      0      +2.3   0      0       4       6       3.345   45      239    0.466
38       0      0      -2.3   0      0       4       6       2.155   45      239    0.333
39       0      0      0      +2.3   0       4       6       2.75    68.78   239    0.116
40       0      0      0      -2.3   0       4       6       2.75    21.22   239    0.710
41       0      0      0      0      +2.3    4       6       2.75    45      358    0.358
42       0      0      0      0      -2.3    4       6       2.75    45      120    0.375
43       0      0      0      0      0       4       6       2.75    45      239    0.316
44       0      0      0      0      0       4       6       2.75    45      239    0.183
45       0      0      0      0      0       4       6       2.75    45      239    0.235
46       0      0      0      0      0       4       6       2.75    45      239    0.291
47       0      0      0      0      0       4       6       2.75    45      239    0.325
48       0      0      0      0      0       4       6       2.75    45      239    0.361
49       0      0      0      0      0       4       6       2.75    45      239    0.359
50       0      0      0      0      0       4       6       2.75    45      239    0.230
51       0      0      0      0      0       4       6       2.75    45      239    0.361
52       0      0      0      0      0       4       6       2.75    45      239    0.350
Problem 2.25 [43]

To calculate the height of a SiO2 drying apparatus or fluidized column, and to estimate the intensity of mass and heat exchange, it is necessary to know the volume of the fluidized granule bed in its operational, i.e. fluidized, state. According to literature data on fluidization with gases, at increased gas flow velocity a bed of fluidized granules passes from calm homogeneous into inhomogeneous fluidization. There is no quantitative dependence of satisfactory accuracy for the expansion of the fluidized bed. Therefore the objective of this research is to study the expansion of a fluidized bed of SiO2 granules and to elaborate a mathematical model of the process. Prior studies have shown that the process may be mathematically described only in this temperature interval: 250-400 °C. Process factors are: $X_1$, air temperature, °C; $X_2$, air velocity (taken at 20 °C), m/s; $X_3$, diameter of moist granules, mm; and $X_4$, moisture of granules, %. The design of the experiment with outcomes is shown in Table 2.160.

Table 2.160  CCRD $2^4$ + 2 x 4 + 7

No.      Sequence    Design matrix           Operational matrix            y=H/H0
trials   of trials   X1    X2    X3    X4    x1     x2      x3    x4
1        16          -     -     -     -     250    2.5     3     40      5.6
2        11          +     -     -     -     350    2.5     3     40      8.8
3        8           -     +     -     -     250    3       3     40      7.8
4        24          +     +     -     -     350    3       3     40      11.0
5        28          -     -     +     -     250    2.5     5     40      3.2
6        7           +     -     +     -     350    2.5     5     40      6.12
7        25          -     +     +     -     250    3       5     40      6.0
8        12          +     +     +     -     350    3       5     40      7.8
9        2           -     -     -     +     250    2.5     3     60      3.3
10       4           +     -     -     +     350    2.5     3     60      3.7
11       6           -     +     -     +     250    3       3     60      6.7
12       15          +     +     -     +     350    3       3     60      4.9
13       18          -     -     +     +     250    2.5     5     60      2.4
14       23          +     -     +     +     350    2.5     5     60      3.0
15       20          -     +     +     +     250    3       5     60      3.9
16       24          +     +     +     +     350    3       5     60      3.5
17       3           -2    0     0     0     200    2.75    4     50      3.27
18       1           +2    0     0     0     400    2.75    4     50      5.1
19       17          0     -2    0     0     300    2.25    4     50      3.4
20       10          0     +2    0     0     300    3.25    4     50      7.75
21       14          0     0     -2    0     300    2.75    2     50      9.37
22       21          0     0     +2    0     300    2.75    6     50      4.5
23       31          0     0     0     -2    300    2.75    4     30      9.18
24       26          0     0     0     +2    300    2.75    4     70      2.8
25       9           0     0     0     0     300    2.75    4     50      4.4
26       22          0     0     0     0     300    2.75    4     50      3.8
27       19          0     0     0     0     300    2.75    4     50      3.7
28       30          0     0     0     0     300    2.75    4     50      4.0
29       5           0     0     0     0     300    2.75    4     50      4.4
30       13          0     0     0     0     300    2.75    4     50      4.0
31       29          0     0     0     0     300    2.75    4     50      4.5

Problem  2.26  [44] 

Mathematical design of experiments has been applied to the mathematical modeling of solidification and hardness of concrete as a function of three basic factors: $X_1$, cement consumption, kg/m³; $X_2$, percentage of sand in the filler mixture, %; and $X_3$, water consumption, l/min. This parameter was measured as the response: $y_1$, concrete solidification, s. Cement of the same brand and sand from the same supplier were used in all design points. A mixture 10 l in volume was mixed manually for 3 min, and a 7 l volume for 2.5 min. Samples of 10 x 10 x 10 cm were prepared on a vibration table with an amplitude of 0.45-0.50 mm, a frequency of 2800 min^-1, and under a pressure of 80-100 kp/cm². Concrete solidification was measured 10-15 min after formation of the samples, according to GOST 10181-62. The basic experiment was done by FUFE $2^3$, as shown in Table 2.161.


Table 2.161  FUFE $2^3$

Variation levels           Design matrix              Response
                           x1      x2       x3
Basic level                525     0.31     189
Variation interval         75      0.09     21
Upper level                600     0.4      210
Lower level                450     0.22     168

Number of design points    X1      X2       X3        y
1                          -       -        -         160
2                          -       -        +         66
3                          -       +        -         58
4                          -       +        +         8
5                          +       -        -         225
6                          +       -        +         42
7                          +       +        -         160
8                          +       +        +         23
According to preliminary information, concrete solidification is a nonlinear function of the composition. Prove this by processing the FUFE outcomes. To check the estimate of the sum of the coefficients of the square terms, six design points were done in the experimental center, and these outcomes were obtained: 45; 49; 44; 40; 42 and 44. Test the hypothesis that the regression coefficients of the square terms differ from zero by comparing the difference $b_0 - \bar{y}_0$ with the experimental error. This means that, to obtain a second-order model by CCRD, it was necessary to do another six design points at the star points. The outcomes of all 20 design points are given in Table 2.162. Determine the second-order model.


Table 2.162  CCRD $2^3$ + 2 x 3 + 6

X1       X2       X3       X1^2    X2^2    X3^2    X1X2   X1X3   X2X3   y
-        -        -        +       +       +       +      +      +      160
-        -        +        +       +       +       +      -      -      66
-        +        -        +       +       +       -      +      -      58
-        +        +        +       +       +       -      -      +      8
+        -        -        +       +       +       -      -      +      225
+        -        +        +       +       +       -      +      -      42
+        +        -        +       +       +       +      -      -      160
+        +        +        +       +       +       +      +      +      23
-1.682   0        0        2.828   0       0       0      0      0      35
+1.682   0        0        2.828   0       0       0      0      0      72
0        -1.682   0        0       2.828   0       0      0      0      137
0        +1.682   0        0       2.828   0       0      0      0      34
0        0        -1.682   0       0       2.828   0      0      0      261
0        0        +1.682   0       0       2.828   0      0      0      14
0        0        0        0       0       0       0      0      0      45
0        0        0        0       0       0       0      0      0      49
0        0        0        0       0       0       0      0      0      44
0        0        0        0       0       0       0      0      0      40
0        0        0        0       0       0       0      0      0      42
0        0        0        0       0       0       0      0      0      44

Problem  2.27  [45] 

In car tire development, an optimal combination of a three-component
composition has been researched: X1, hydrated silicate, PHR;
X2, silane, PHR; and X3, sulfur, PHR. These parameters were measured
as system responses: y1, PICO abrasion index; y2, 200% modulus;
y3, strain at break; and y4, hardness. The experiment was
done in accord with CCRD. Experimental outcomes are shown in
Table 2.163. Determine the values of the regression coefficients for
all four responses.


Table 2.163  CCRD 2³ + 2×3 + 6

Variation levels   X1     X2    X3
0                  1.2    50    2.3
Δx                 0.5    10    0.5
+                  1.7    60    2.8
-                  0.7    40    1.8

             Design matrix            Responses
No. trials   X1      X2      X3       y1     y2      y3     y4
1            -       -       -        102    900     470    67.5
2            +       -       -        120    860     410    65
3            -       +       -        117    800     570    77.5
4            +       +       -        198    2294    240    74.5
5            -       -       +        103    490     640    62.5
6            +       -       +        132    1289    270    67
7            -       +       +        132    1270    410    78
8            +       +       +        139    1090    380    70
9            -1.68   0       0        102    770     590    76
10           +1.68   0       0        154    1690    260    70
11           0       -1.68   0        96     700     520    63
12           0       +1.68   0        163    1540    380    75
13           0       0       -1.68    116    2184    520    65
14           0       0       +1.68    153    1784    290    71
15           0       0       0        133    1300    380    70
16           0       0       0        133    1300    380    68.5
17           0       0       0        140    1145    430    68
18           0       0       0        142    1090    430    68
19           0       0       0        145    1260    390    69
20           0       0       0        142    1344    390    70

2.3.3 

Orthogonal Second-order Design (Box-Behnken Design)

Orthogonal designs were used in the first few studies dealing with the
application of designed experiments to obtaining second-order regression
models. Although much more optimized designs for obtaining second-order models
were presented in the previous chapter, second-order orthogonal design is still used
in practice.

Expression (2.74), which defines the condition for design orthogonality in the
general case, shows that in the matrix of a central composite second-order design
not all column vectors are orthogonal. To provide design orthogonality and simplify
the calculation of regression-equation coefficients, the square variables are transformed
and suitable distances of the “starlike” points are chosen. The transformation is given
by the expression:

$$X'_{iu} = X_{iu}^{2} - \frac{1}{N}\sum_{u=1}^{N} X_{iu}^{2} = X_{iu}^{2} - \overline{X_{i}^{2}} \tag{2.104}$$

Thus the following condition is fulfilled:

$$\sum_{u=1}^{N} X_{0u} X'_{iu} = 0 \tag{2.105}$$

As already mentioned, the orthogonality of the other column vectors is provided by
the selection of the distances of the “starlike” points. The design matrix of central
composite orthogonal designs (CCOD) is obtained, for different numbers of factors (k),
by upgrading the associated FUFE designs or half-replicas (for k≥5) with additional
points in the experimental center and a corresponding number of “starlike” points.
The number of CCOD design points, with the values of the “starlike” distances, is given
in Table 2.164. Design matrices for k=2-4 are given in Tables 2.165-2.167.
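As a cross-check of the starlike distances listed in Table 2.164, the following minimal sketch uses the commonly quoted orthogonality condition for a central composite design, α² = (√(N·n1) − n1)/2, where n1 is the number of core points and N the total number of design points. This formula is not stated in this section and is taken here as an assumption; the function name is purely illustrative.

```python
import math

def orthogonal_alpha(n_core: int, n_total: int) -> float:
    """Starlike distance for an orthogonal central composite design.

    Assumes the condition alpha^2 = (sqrt(N * n1) - n1) / 2.
    """
    return math.sqrt((math.sqrt(n_total * n_core) - n_core) / 2.0)

# (k, core points n1, total points N) for the full-factorial cores of Table 2.164
for k, n_core, n_total in [(2, 4, 9), (3, 8, 15), (4, 16, 25)]:
    print(k, round(orthogonal_alpha(n_core, n_total), 3))
```

Under this assumption the three full-factorial cases give 1.000, 1.215 and 1.414, in agreement with the tabulated values for k=2-4.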


Table 2.164  Construction of orthogonal designs

No. of factors   No. of core   No. of starlike   No. of null   Coded value   Total no. of
k                points n1     points nα         points n0     α             design points N
2                4             4                 1             1.000         9
3                8             6                 1             1.215         15
4                16            8                 1             1.414         25
5*               16            10                1             1.414         27
6*               32            12                1             -             45

* Core of half-replica design

Table 2.165  Central composite orthogonal design 2² + 2×2 + 1

                 Design matrix           Square factors                        Operational matrix   Response
No. trials       X0    X1    X2    X1X2  X1' = X1² - 2/3   X2' = X2² - 2/3     x1      x2           yu
1                +     +     +     +     0.33              0.33                2.5     120          50
2                +     -     +     -     0.33              0.33                1.3     120          67
3                +     +     -     -     0.33              0.33                2.5     20           70
4                +     -     -     +     0.33              0.33                1.3     20           60
5                +     -     0     0     0.33              -0.67               1.3     70           70
6                +     +     0     0     0.33              -0.67               2.5     70           56
7                +     0     -     0     -0.67             0.33                1.9     20           73
8                +     0     +     0     -0.67             0.33                1.9     120          60
9                +     0     0     0     -0.67             -0.67               1.9     70           62
Sum of squares   9     6     6     4     2                 2

In accord with the performed transformation, regression coefficient $b_0$ is
determined from the expression:

$$b_0 = b'_0 - b_{11}\overline{X_1^2} - b_{22}\overline{X_2^2} - \dots - b_{kk}\overline{X_k^2} \tag{2.106}$$

and its significance is checked using the variance:

$$s^2_{b_0} = s^2_{b'_0} + \left(\overline{X_1^2}\right)^2 s^2_{b_{11}} + \left(\overline{X_2^2}\right)^2 s^2_{b_{22}} + \dots + \left(\overline{X_k^2}\right)^2 s^2_{b_{kk}} \tag{2.107}$$

Other regression coefficients are determined in accord with the formulas for
orthogonal designs. A check of significance of regression coefficients is given in
Sect. 2.4.1, and the check of lack of fit of the regression model is in Sect. 2.4.3.


Example  2.51 

This example refers to the dependence of a response on two factors (k=2). According
to Table 2.164, the orthogonal second-order design in this case has nine design points
(N=9). The design matrix with the outcomes of the design points is shown in Table 2.165.
The same case was elaborated in the previous section, in Example 2.43, by application
of a rotatable second-order design. However, the connection between coded
and real values of the factors for the same null point is now different:

$$X_1 = \frac{x_1 - 1.9}{0.6}; \qquad X_2 = \frac{x_2 - 70}{50} \tag{2.108}$$

Table 2.166  Central composite orthogonal design 2³ + 2×3 + 1

No. of           Design matrix                                                                        Response
design points    X0    X1       X2       X3       X1'      X2'      X3'      X1X2   X1X3   X2X3      yu
1                +     +        +        +        0.270    0.270    0.270    +      +      +         82
2                +     -        +        +        0.270    0.270    0.270    -      -      +         82
3                +     +        -        +        0.270    0.270    0.270    -      +      -         42
4                +     -        -        +        0.270    0.270    0.270    +      -      -         70
5                +     +        +        -        0.270    0.270    0.270    +      -      -         60
6                +     -        +        -        0.270    0.270    0.270    -      +      -         80
7                +     +        -        -        0.270    0.270    0.270    -      -      +         48
8                +     -        -        -        0.270    0.270    0.270    +      +      +         70
9                +     -1.215   0        0        0.747    -0.730   -0.730   0      0      0         80
10               +     +1.215   0        0        0.747    -0.730   -0.730   0      0      0         60
11               +     0        -1.215   0        -0.730   0.747    -0.730   0      0      0         54
12               +     0        +1.215   0        -0.730   0.747    -0.730   0      0      0         88
13               +     0        0        -1.215   -0.730   -0.730   0.747    0      0      0         85
14               +     0        0        +1.215   -0.730   -0.730   0.747    0      0      0         74
15               +     0        0        0        -0.730   -0.730   -0.730   0      0      0         70
Sum of squares   15    10.94    10.94    10.94    4.34     4.34     4.34     8      8      8

This difference may be explained by the necessity of changing the factor-variation
intervals. Linear regression coefficients are calculated by formula (2.62), which here is:

$$b_i = \frac{\sum_{u=1}^{N} X_{iu} y_u}{6} \tag{2.109}$$

so that:

$$b_1 = \frac{1}{6}(50 - 67 + 70 - 60 - 70 + 56) = -3.50$$

$$b_2 = \frac{1}{6}(50 + 67 - 70 - 60 - 73 + 60) = -4.33$$

Coefficient $b_{12}$, which characterizes the effect of the two-factor interaction, is
determined in accord with the formula:

$$b_{12} = \frac{\sum_{u=1}^{N} X_{1u}X_{2u}y_u}{4} = \frac{1}{4}(50 - 67 - 70 + 60) = -6.75$$

To determine coefficients $b_{11}$ and $b_{22}$, we use:

$$b_{ii} = \frac{\sum_{u=1}^{N} X'_{iu} y_u}{\sum_{u=1}^{N} \left(X'_{iu}\right)^2} \tag{2.110}$$

so that:

$$b_{11} = 0.5[0.33(50+67+70+60+70+56) - 0.67(73+60+62)] = -3.78;$$

$$b_{22} = 0.5[0.33(50+67+70+60+73+60) - 0.67(70+56+62)] = -0.30.$$

To calculate $b_0$ we have to determine $b'_0$:

$$b'_0 = \frac{\sum_{u=1}^{N} X_{0u} y_u}{\sum_{u=1}^{N} X_{0u}^2} = \frac{\sum_{u=1}^{N} y_u}{N} \tag{2.111}$$

$$b'_0 = \frac{\sum_{u=1}^{9} y_u}{9} = \frac{568}{9} = 63.11 \tag{2.112}$$

From Eq. (2.106) we get:

$$b_0 = b'_0 - b_{11}\overline{X_1^2} - b_{22}\overline{X_2^2}$$

where:

$$\overline{X_i^2} = \frac{\sum_{u=1}^{N} X_{iu}^2}{N} = \frac{\sum_{u=1}^{9} X_{iu}^2}{9} = \frac{6}{9} = 0.67 \tag{2.113}$$

so that: $b_0 = b'_0 - 0.67b_{11} - 0.67b_{22} = 65.84$

A check of statistical significance of the regression coefficients (Sect. 2.4.2) indicates
that regression coefficients $b_{11}$ and $b_{22}$ are statistically insignificant. With 95%
confidence, the final form of the second-order regression model is:

$$\hat{y} = 65.84 - 3.50X_1 - 4.33X_2 - 6.75X_1X_2 \tag{2.114}$$

Lack of fit of the obtained regression model is checked by the formulas from Sect.
2.4.3 for the variant with an identical number of replications of all the design points
(n=25). The obtained outcomes are given in Example 2.63 and Example 2.65.
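The calculation in Example 2.51 can be retraced with a short script. The sketch below applies the orthogonal-design estimate (sum of column × response divided by the column sum of squares) to the nine design points of Table 2.165; the helper name `column_estimate` and the variable names are illustrative only. It keeps the unrounded means 1/3 and 2/3 instead of 0.33 and 0.67, so the two small quadratic coefficients come out near -2.83 and 0.67 rather than the hand-calculated -3.78 and -0.30, which indicates how sensitive they are to that rounding; the linear and interaction coefficients are unaffected.

```python
# Coded factor levels and responses of Table 2.165: (X1, X2, y)
runs = [
    (+1, +1, 50), (-1, +1, 67), (+1, -1, 70), (-1, -1, 60),
    (-1, 0, 70), (+1, 0, 56), (0, -1, 73), (0, +1, 60), (0, 0, 62),
]

def column_estimate(column, y):
    """Orthogonal-design estimate: sum(x * y) / sum(x * x)."""
    return sum(x * v for x, v in zip(column, y)) / sum(x * x for x in column)

x1 = [r[0] for r in runs]
x2 = [r[1] for r in runs]
y = [r[2] for r in runs]
n = len(runs)

mean_sq1 = sum(x * x for x in x1) / n    # 6/9, rounded to 0.67 in the text
mean_sq2 = sum(x * x for x in x2) / n
x1_sq = [x * x - mean_sq1 for x in x1]   # transformed square column, Eq. (2.104)
x2_sq = [x * x - mean_sq2 for x in x2]
x12 = [a * b for a, b in zip(x1, x2)]

b1 = column_estimate(x1, y)
b2 = column_estimate(x2, y)
b12 = column_estimate(x12, y)
b11 = column_estimate(x1_sq, y)
b22 = column_estimate(x2_sq, y)
b0_prime = sum(y) / n
b0 = b0_prime - b11 * mean_sq1 - b22 * mean_sq2   # Eq. (2.106)

print(f"b1={b1:.2f} b2={b2:.2f} b12={b12:.2f}")
print(f"b11={b11:.2f} b22={b22:.2f} b0'={b0_prime:.2f} b0={b0:.2f}")
```

Because $b_{11}$ and $b_{22}$ are judged statistically insignificant in the example, the difference does not change the structure of model (2.114), only the value of the free term.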


Example  2.52 

The second-order orthogonal design for k=3 is shown in Table 2.166. Based on the
design-point outcomes, we can calculate the regression coefficients:

$$b_i = \frac{\sum_{u=1}^{N} X_{iu}y_u}{10.94}; \qquad b_{ij} = \frac{\sum_{u=1}^{N} X_{iu}X_{ju}y_u}{8}; \qquad b_{ii} = \frac{\sum_{u=1}^{N} X'_{iu}y_u}{4.34}; \qquad b'_0 = \frac{\sum_{u=1}^{N} y_u}{15}$$

$$b_0 = b'_0 - 0.73b_{11} - 0.73b_{22} - 0.73b_{33}$$

After processing the outcomes, we obtain the regression model:

$$\hat{y} = 75.6 - 8.6X_1 + 10.5X_2 + 0.4X_3 - 5.1X_1^2 - 4.4X_2^2 + 1.3X_3^2 + 3.7X_1X_2 + 1.7X_1X_3 + 3.7X_2X_3$$

The obtained regression model is adequate with 95% confidence. A check of
significance of the regression coefficients, in accord with Sect. 2.4.2, is completed thus:

$$s^2_{b_0} = \frac{s^2_{\bar{y}}}{15}; \qquad s^2_{b_i} = \frac{s^2_{\bar{y}}}{10.94}; \qquad s^2_{b_{ij}} = \frac{s^2_{\bar{y}}}{8}; \qquad s^2_{b_{ii}} = \frac{s^2_{\bar{y}}}{4.34}$$

If $s^2_{\bar{y}} = 15$, then:

$$s^2_{b_0} = 1.00; \qquad s^2_{b_i} = 1.36; \qquad s^2_{b_{ij}} = 1.87; \qquad s^2_{b_{ii}} = 3.45.$$

Then:

$$\Delta b_0 = \pm 2.0; \qquad \Delta b_i = \pm 2.34; \qquad \Delta b_{ij} = \pm 2.74; \qquad \Delta b_{ii} = \pm 3.70.$$

With 95% confidence then:

$$\hat{y} = 75.6 - 8.6X_1 + 10.5X_2 - 5.1X_1^2 - 4.4X_2^2 + 3.7X_1X_2 + 3.7X_2X_3 \tag{2.115}$$
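The significance check of Example 2.52 amounts to dividing one variance by the sums of squares of the design-matrix columns and multiplying the resulting standard deviation by the tabular criterion. The sketch below reproduces that arithmetic; the factor 2.0 is an assumption read off from the quoted interval Δb0 = ±2.0 at a variance of 1.00, and all names are illustrative.

```python
import math

s2_mean = 15.0   # variance value assumed in Example 2.52
t_value = 2.0    # rounded criterion implied by the quoted intervals (assumption)

# Sums of squares of the design-matrix columns for k = 3 (Table 2.166)
column_ss = {"b0": 15.0, "bi": 10.94, "bij": 8.0, "bii": 4.34}

# Variance of each coefficient type and the matching confidence half-width
variances = {name: s2_mean / ss for name, ss in column_ss.items()}
half_widths = {name: t_value * math.sqrt(v) for name, v in variances.items()}

for name in column_ss:
    print(f"{name}: s2 = {variances[name]:.2f}, delta = +/-{half_widths[name]:.2f}")
```

The printed values agree with those of the example to within rounding in the second decimal place. A coefficient is kept in model (2.115) only when its absolute value exceeds the half-width of its group.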


Example  2.53 

The orthogonal design for four factors (k=4) is given in Table 2.167, together with the
outcomes of the design points. The regression coefficients are determined thus:

$$b_i = \frac{\sum_{u=1}^{N} X_{iu}y_u}{20}; \qquad b_{ij} = \frac{\sum_{u=1}^{N} X_{iu}X_{ju}y_u}{16}; \qquad b_{ii} = \frac{\sum_{u=1}^{N} X'_{iu}y_u}{8}; \qquad b'_0 = \frac{\sum_{u=1}^{N} y_u}{25}$$

$$b_0 = b'_0 - 0.80b_{11} - 0.80b_{22} - 0.80b_{33} - 0.80b_{44}$$

The second-order regression model has the following form:

$$\hat{y} = 82.8 + 18.4X_1 + 17.3X_2 - 22.6X_3 + 30.7X_4 - 6.9X_1^2 + 1.4X_2^2 + 2.4X_3^2 - 0.5X_4^2 + 1.5X_1X_2 - 5.6X_1X_3 + 6.1X_1X_4 - 1.4X_2X_3 + 5.3X_2X_4 - 7.0X_3X_4 \tag{2.116}$$

Table 2.167  Central composite orthogonal design 2⁴ + 2×4 + 1

                 Design matrix                                                                     Response   Predicted response
No. trials       X0    X1       X2       X3       X4       X1'    X2'    X3'    X4'    X1X2        yu         ŷu
1                +     +        +        +        +        0.2    0.2    0.2    0.2    +           118        122
2                +     -        +        +        +        0.2    0.2    0.2    0.2    -           80         81
3                +     +        -        +        +        0.2    0.2    0.2    0.2    -           78         76
4                +     -        -        +        +        0.2    0.2    0.2    0.2    +           42         42
5                +     +        +        -        +        0.2    0.2    0.2    0.2    +           200        195
6                +     -        +        -        +        0.2    0.2    0.2    0.2    -           128        132
7                +     +        -        -        +        0.2    0.2    0.2    0.2    -           145        144
8                +     -        -        -        +        0.2    0.2    0.2    0.2    +           88         87
9                +     +        +        +        -        0.2    0.2    0.2    0.2    +           52         52
10               +     -        +        +        -        0.2    0.2    0.2    0.2    -           33         35
11               +     +        -        +        -        0.2    0.2    0.2    0.2    -           30         28
12               +     -        -        +        -        0.2    0.2    0.2    0.2    +           14         17
13               +     +        +        -        -        0.2    0.2    0.2    0.2    +           95         97
14               +     -        +        -        -        0.2    0.2    0.2    0.2    -           58         58
15               +     +        -        -        -        0.2    0.2    0.2    0.2    -           70         67
16               +     -        -        -        -        0.2    0.2    0.2    0.2    +           37         34
17               +     -1.414   0        0        0        1.2    -0.8   -0.8   -0.8   0           48         43
18               +     +1.414   0        0        0        1.2    -0.8   -0.8   -0.8   0           90         95
19               +     0        -1.414   0        0        -0.8   1.2    -0.8   -0.8   0           55         61
20               +     0        +1.414   0        0        -0.8   1.2    -0.8   -0.8   0           116        110
21               +     0        0        -1.414   0        -0.8   -0.8   1.2    -0.8   0           115        120
22               +     0        0        +1.414   0        -0.8   -0.8   1.2    -0.8   0           60         56
23               +     0        0        0        -1.414   -0.8   -0.8   -0.8   1.2    0           38         38
24               +     0        0        0        +1.414   -0.8   -0.8   -0.8   1.2    0           125        125
25               +     0        0        0        0        -0.8   -0.8   -0.8   -0.8   0           84         83
Sum of squares   25    20       20       20       20       8      8      8      8      16

A check of lack of fit of the regression model (2.116) is done in accordance with
the formulas from Sect. 2.4.3 for the case where all design points are replicated the
same number of times (n=25). The predicted response values of the regression model
are also given in Table 2.167. The variance of lack of fit is calculated thus:

$$S^2_{AD} = \frac{n\sum_{u=1}^{N}\left(\bar{y}_u - \hat{y}_u\right)^2}{N - \frac{(k+2)(k+1)}{2}} = \frac{25 \times 267}{25 - 15} = 667.5$$

The arithmetic value of Fisher's criterion is determined for $S^2_y = 375.0$:

$$F_R = \frac{S^2_{AD}}{S^2_y} = \frac{667.5}{375.0} = 1.78$$

A tabular value $F_T = 1.85$ is obtained from Table E for $f_{AD} = 10$ and
$f_E = 25(25-1) = 600$. Since $F_R < F_T$, the regression model is adequate with 95%
confidence. To estimate the statistical significance of the regression coefficients, use
the formulas from Sect. 2.4.2, with $S^2_{\bar{y}} = S^2_y/n = 375.0/25 = 15.0$:

$$S^2_{b_0} = \frac{S^2_{\bar{y}}}{25}; \qquad S^2_{b_i} = \frac{S^2_{\bar{y}}}{20}; \qquad S^2_{b_{ij}} = \frac{S^2_{\bar{y}}}{16}; \qquad S^2_{b_{ii}} = \frac{S^2_{\bar{y}}}{8}$$

So that:

$$S^2_{b_0} = 0.60; \qquad S^2_{b_i} = 0.75; \qquad S^2_{b_{ij}} = 0.94; \qquad S^2_{b_{ii}} = 1.87.$$

$$\Delta b_0 = \pm 2 \times 0.77 = \pm 1.54; \quad \Delta b_i = \pm 2 \times 0.87 = \pm 1.74; \quad \Delta b_{ij} = \pm 2 \times 0.97 = \pm 1.94; \quad \Delta b_{ii} = \pm 2 \times 1.37 = \pm 2.74.$$

Having taken the given values into consideration, the regression model becomes:

$$\hat{y} = 82.8 + 18.4X_1 + 17.3X_2 - 22.6X_3 + 30.7X_4 - 6.9X_1^2 + 6.1X_1X_4 - 5.6X_1X_3 + 5.3X_2X_4 - 7.0X_3X_4 \tag{2.117}$$
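The lack-of-fit check for model (2.116) can be retraced numerically from the measured and predicted responses of Table 2.167. The sketch below is only an illustration of that arithmetic, with variable names chosen here; the tabular value of Fisher's criterion is not computed but would be looked up as in the text.

```python
# Measured and predicted responses of the 25 design points (Table 2.167)
y_obs = [118, 80, 78, 42, 200, 128, 145, 88, 52, 33, 30, 14, 95,
         58, 70, 37, 48, 90, 55, 116, 115, 60, 38, 125, 84]
y_hat = [122, 81, 76, 42, 195, 132, 144, 87, 52, 35, 28, 17, 97,
         58, 67, 34, 43, 95, 61, 110, 120, 56, 38, 125, 83]

n_rep = 25                         # replications of every design point
k = 4                              # number of factors
n_points = len(y_obs)              # N = 25
n_coef = (k + 2) * (k + 1) // 2    # coefficients of the full second-order model
s2_y = 375.0                       # reproducibility variance quoted in the text

ss_resid = sum((yo - yh) ** 2 for yo, yh in zip(y_obs, y_hat))
f_ad = n_points - n_coef           # degrees of freedom of lack of fit
s2_ad = n_rep * ss_resid / f_ad    # variance of lack of fit
f_ratio = s2_ad / s2_y             # arithmetic value of Fisher's criterion

print(ss_resid, f_ad, s2_ad, round(f_ratio, 2))
```

The script returns the sum of squared deviations 267, ten degrees of freedom, the lack-of-fit variance 667.5 and the ratio 1.78, which is then compared with the tabular value 1.85.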


To calculate regression coefficients more efficiently, the sums of squares of the elements
in the design-matrix columns are shown in Table 2.168.

Table 2.168  Sums of squares of columns

Number of     Value of the sum of squares of the column (u = 1, ..., N) used for
factors k     b0      bi       bij     bii
2             9       6        4       2
3             15      10.94    8       4.34
4             25      20       16      8
5             27      20       16      8

Example  2.54  [12] 

The oxidation of sodium hypophosphite with iron has been studied. To
establish the conditions for quantitative oxidation of NaH2PO2 with iron and the
possible reaction surface, an experiment was defined for obtaining a mathematical
model of the process. The experimental objective is to find the conditions for
obtaining 100% oxidation of NaH2PO2.

In accordance with the data from the literature, the reaction in a constant volume
of 100 ml and at the boiling-point temperature is defined by these three factors:

X1 iron concentration;

X2 acidity and
X3 oxidation time.

All design points are done in a 100 ml volume with 20.14 mg of NaH2PO2·H2O.

The reaction was interrupted by sudden cooling of the solution, after which
the quantity of oxidized hypophosphite was determined. The system response is the
rate of sodium hypophosphite oxidation. The suggestion is to apply CCOD to obtain
a second-order mathematical model. The experiment was designed according
to the orthogonal design, after optimization by the steepest ascent method, with the
design center in the design point that had the greatest oxidation rate of NaH2PO2.

The design of experiment with the outcomes of the design points is given in Table 2.169.


Table 2.169  CCOD 2³ + 2×3 + 1

Variation levels       Factors
                       X1       X2     X3
Basic level            0.032    1.0    15
Variation interval     0.005    0.5    5

             Design matrix                       Response
No. trials   X0    X1       X2       X3          yu       ŷu
1            +     -        -        -           96.18    94.19
2            +     +        -        -           97.88    97.65
3            +     -        +        -           92.96    93.61
4            +     +        +        -           98.34    97.55
5            +     -        -        +           97.36    98.47
6            +     +        -        +           98.18    98.93
7            +     -        +        +           95.24    94.89
8            +     +        +        +           99.32    98.83
9            +     -1.215   0        0           98.30    97.02
10           +     +1.215   0        0           98.40    99.69
11           +     0        -1.215   0           99.78    97.68
12           +     0        +1.215   0           94.53    95.45
13           +     0        0        -1.215      97.34    97.57
14           +     0        0        +1.215      99.24    99.13
15           +     0        0        0           99.08    98.55

The regression coefficients are calculated as follows:

$$b_1 = \frac{\sum_{u=1}^{N} X_{1u}y_u}{10.94} = \frac{1}{10.94}(-96.18 + 97.88 - 92.96 + 98.34 - 97.36 + 98.18 - 95.24 + 99.32 - 1.215 \times 98.30 + 1.215 \times 98.40) = 1.11$$

$$b_2 = \frac{\sum_{u=1}^{N} X_{2u}y_u}{10.94} = \frac{1}{10.94}(-96.18 - 97.88 + 92.96 + 98.34 - 97.36 - 98.18 + 95.24 + 99.32 - 1.215 \times 99.78 + 1.215 \times 94.53) = -0.92$$

$$b_3 = \frac{\sum_{u=1}^{N} X_{3u}y_u}{10.94} = \frac{1}{10.94}(-96.18 - 97.88 - 92.96 - 98.34 + 97.36 + 98.18 + 95.24 + 99.32 - 1.215 \times 97.34 + 1.215 \times 99.24) = 0.64$$

$$b_{12} = \frac{\sum_{u=1}^{N} X_{1u}X_{2u}y_u}{8} = \frac{1}{8}(96.18 - 97.88 - 92.96 + 98.34 + 97.36 - 98.18 - 95.24 + 99.32) = 0.87$$

$$b_{13} = \frac{\sum_{u=1}^{N} X_{1u}X_{3u}y_u}{8} = \frac{1}{8}(96.18 - 97.88 + 92.96 - 98.34 - 97.36 + 98.18 - 95.24 + 99.32) = -0.27$$

$$b_{23} = \frac{\sum_{u=1}^{N} X_{2u}X_{3u}y_u}{8} = \frac{1}{8}(96.18 + 97.88 - 92.96 - 98.34 - 97.36 - 98.18 + 95.24 + 99.32) = 0.22$$

$$b_{11} = \frac{\sum_{u=1}^{N} X'_{1u}y_u}{4.34} = \frac{1}{4.34}(0.270 \times 96.18 + 0.270 \times 97.88 + 0.270 \times 92.96 + 0.270 \times 98.34 + 0.270 \times 97.36 + 0.270 \times 98.18 + 0.270 \times 95.24 + 0.270 \times 99.32 + 0.747 \times 98.30 + 0.747 \times 98.40 - 0.730 \times 99.78 - 0.730 \times 94.53 - 0.730 \times 97.34 - 0.730 \times 99.24 - 0.730 \times 99.08) = -0.32$$

$$b_{22} = \frac{\sum_{u=1}^{N} X'_{2u}y_u}{4.34} = \frac{1}{4.34}(0.270 \times 96.18 + 0.270 \times 97.88 + 0.270 \times 92.96 + 0.270 \times 98.34 + 0.270 \times 97.36 + 0.270 \times 98.18 + 0.270 \times 95.24 + 0.270 \times 99.32 - 0.730 \times 98.30 - 0.730 \times 98.40 + 0.747 \times 99.78 + 0.747 \times 94.53 - 0.730 \times 97.34 - 0.730 \times 99.24 - 0.730 \times 99.08) = -1.12$$

$$b_{33} = \frac{\sum_{u=1}^{N} X'_{3u}y_u}{4.34} = \frac{1}{4.34}(0.270 \times 96.18 + 0.270 \times 97.88 + \dots + 0.270 \times 99.32 - 0.730 \times 98.30 - 0.730 \times 98.40 - 0.730 \times 99.78 - 0.730 \times 94.53 + 0.747 \times 97.34 + 0.747 \times 99.24 - 0.730 \times 99.08) = -0.36$$

$$b_0 = b'_0 - b_{11}\overline{X_1^2} - b_{22}\overline{X_2^2} - b_{33}\overline{X_3^2} = 97.48 - 0.73(-0.32) - 0.73(-1.12) - 0.73(-0.36) = 98.79$$

A check of significance of the regression coefficients shows that, with 95% confidence,
the following regression coefficients are statistically significant: $b_0$; $b_1$; $b_2$; $b_3$;
$b_{12}$ and $b_{22}$. The second-order regression model has this form:

$$\hat{y} = 98.79 + 1.11X_1 - 0.92X_2 + 0.64X_3 + 0.87X_1X_2 - 1.12X_2^2 \tag{2.118}$$
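The linear and interaction coefficients of Example 2.54 can be recomputed directly from the design points of Table 2.169. The sketch below uses the orthogonal-design estimate, dividing the sum of column × response by the column sum of squares; it works with the unrounded sum of squares (about 10.95 rather than 10.94), so the last digit may differ slightly from the hand calculation. The helper name `estimate` and the dictionary layout are illustrative.

```python
ALPHA = 1.215  # starlike distance for k = 3 (Table 2.164)

# (X1, X2, X3, y) for the 15 design points of Table 2.169
runs = [
    (-1, -1, -1, 96.18), (+1, -1, -1, 97.88), (-1, +1, -1, 92.96), (+1, +1, -1, 98.34),
    (-1, -1, +1, 97.36), (+1, -1, +1, 98.18), (-1, +1, +1, 95.24), (+1, +1, +1, 99.32),
    (-ALPHA, 0, 0, 98.30), (+ALPHA, 0, 0, 98.40),
    (0, -ALPHA, 0, 99.78), (0, +ALPHA, 0, 94.53),
    (0, 0, -ALPHA, 97.34), (0, 0, +ALPHA, 99.24),
    (0, 0, 0, 99.08),
]

def estimate(column, y):
    """Orthogonal-design estimate: sum(x * y) / sum(x * x)."""
    return sum(c * v for c, v in zip(column, y)) / sum(c * c for c in column)

y = [r[3] for r in runs]

# Linear columns X1, X2, X3 and the three two-factor interaction columns
cols = {f"b{i + 1}": [r[i] for r in runs] for i in range(3)}
for i, j in [(0, 1), (0, 2), (1, 2)]:
    cols[f"b{i + 1}{j + 1}"] = [r[i] * r[j] for r in runs]

coef = {name: estimate(col, y) for name, col in cols.items()}
for name, value in coef.items():
    print(f"{name} = {value:.2f}")
```

The quadratic coefficients follow in the same way from the transformed columns of Eq. (2.104); they are omitted here because they are small differences of large sums and therefore sensitive to the rounding of 0.270, 0.747 and 0.730.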

Example  2.55  [38] 

The conditions for maximal dissolving of borate in a mixture of sulfurous and
phosphoric acids are investigated. The rate of dissolving borate (y) is controlled by
varying these factors: X1, temperature of reaction, °C; X2, reaction time, min;
X3, stoichiometric ratio of phosphoric acid, %; and X4, concentration of phosphoric
acid, % P2O5.

The basic levels and variation intervals are given in Table 2.170. It is known from
previous design points that the optimum is within the studied factor space. An
orthogonal design has therefore been done to obtain the regression model, Table 2.171.
The reproducibility variance of the experiment is determined from four additional
design points in the experimental center (y01=61.8; y02=59.3; y03=58.7 and y04=64.0).

Table 2.170  Factor-variation intervals

Factors                        Variation levels                               Variation interval
                               -1.414    -1.0    0       +1.0     +1.414     Δx
x1 - temperature               19.75     30      55      80       90.25      25
x2 - time                      5.78      15      37.5    60       69.23      22.5
x3 - stoichiometric ratio      51.80     60      80      100      108.20     20
x4 - concentration of H3PO4    6.29      14      32.8    51.60    59.31      18.8

$$\bar{y}_0 = \frac{\sum_{i=1}^{4} y_{0i}}{4} = 60.95; \qquad S^2_y = \frac{\sum_{i=1}^{n_0}\left(y_{0i} - \bar{y}_0\right)^2}{n_0 - 1} = 5.95; \qquad f = 4 - 1 = 3$$

Regression coefficients have the values:

b0'=61.54   b1=17.37   b2=6.4   b3=4.7   b4=-4.37
b12=2.18   b13=0.2   b14=1.2   b23=0.56   b24=0.79   b34=1.9
b11=4.5   b22=1.3   b33=4.09   b44=-5.34

After rejecting statistically insignificant regression coefficients, the regression
model becomes:

$$\hat{y} = 61.54 + 17.37X_1 + 6.4X_2 + 4.7X_3 - 4.37X_4 + 2.18X_1X_2 + 1.9X_3X_4 + 4.5\left(X_1^2 - 0.8\right) + 4.09\left(X_3^2 - 0.8\right) - 5.34\left(X_4^2 - 0.8\right)$$

$$\hat{y} = 58.9 + 17.37X_1 + 6.4X_2 + 4.7X_3 - 4.37X_4 + 2.18X_1X_2 + 1.9X_3X_4 + 4.5X_1^2 + 4.09X_3^2 - 5.34X_4^2 \tag{2.119}$$
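Two of the numbers in Example 2.55 are easy to verify: the statistics of the center points and the free term of the final model. In the sketch below the fourth center-point outcome is taken as 64.0, the value that reproduces the quoted mean of 60.95; this reading, like the variable names, is an assumption made for the illustration.

```python
import statistics

# Center-point outcomes; the fourth value is taken as 64.0 (assumption, see text)
y_center = [61.8, 59.3, 58.7, 64.0]

y0_mean = statistics.mean(y_center)
s2_y = statistics.variance(y_center)   # sample variance with n0 - 1 degrees of freedom
dof = len(y_center) - 1

# Free term obtained by expanding the centered squares (X_i^2 - 0.8) of the model
b0_prime = 61.54
significant_squares = {"b11": 4.5, "b33": 4.09, "b44": -5.34}
b0 = b0_prime - 0.8 * sum(significant_squares.values())

print(f"mean = {y0_mean:.2f}, variance = {s2_y:.2f}, f = {dof}, b0 = {b0:.2f}")
```

The mean is 60.95 and the variance about 5.94 with three degrees of freedom, close to the quoted 5.95; the free term 58.94 corresponds to the 58.9 in Eq. (2.119).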


Table 2.171  CCOD 2⁴ + 2×4 + 1

No. trials   X0    X1       X2       X3       X4       X1'    X2'    X3'    X4'    y
1            +     +        +        +        +        0.2    0.2    0.2    0.2    86.9
2            +     -        +        +        +        0.2    0.2    0.2    0.2    40.0
3            +     +        +        -        +        0.2    0.2    0.2    0.2    66.0
4            +     -        +        -        +        0.2    0.2    0.2    0.2    34.4
5            +     +        +        +        -        0.2    0.2    0.2    0.2    76.8
6            +     -        +        +        -        0.2    0.2    0.2    0.2    55.7
7            +     +        +        -        -        0.2    0.2    0.2    0.2    91.0
8            +     -        +        -        -        0.2    0.2    0.2    0.2    47.6
9            +     +        -        +        +        0.2    0.2    0.2    0.2    74.1
10           +     -        -        +        +        0.2    0.2    0.2    0.2    52.0
11           +     +        -        -        +        0.2    0.2    0.2    0.2    74.5
12           +     -        -        -        +        0.2    0.2    0.2    0.2    29.6
13           +     +        -        +        -        0.2    0.2    0.2    0.2    94.8
14           +     -        -        +        -        0.2    0.2    0.2    0.2    49.6
15           +     +        -        -        -        0.2    0.2    0.2    0.2    68.6
16           +     -        -        -        -        0.2    0.2    0.2    0.2    51.8
17           +     0        0        0        0        -0.8   -0.8   -0.8   -0.8   61.8
18           +     +1.414   0        0        0        1.2    -0.8   -0.8   -0.8   95.4
19           +     -1.414   0        0        0        1.2    -0.8   -0.8   -0.8   41.7
20           +     0        +1.414   0        0        -0.8   1.2    -0.8   -0.8   79.0
21           +     0        -1.414   0        0        -0.8   1.2    -0.8   -0.8   42.4
22           +     0        0        +1.414   0        -0.8   -0.8   1.2    -0.8   77.6
23           +     0        0        -1.414   0        -0.8   -0.8   1.2    -0.8   58.0
24           +     0        0        0        +1.414   -0.8   -0.8   -0.8   1.2    45.6
25           +     0        0        0        -1.414   -0.8   -0.8   -0.8   1.2    52.3

The interaction columns X1X2, X1X3, X1X4, X2X3, X2X4 and X3X4 are the products of the
corresponding factor columns; they are equal to zero in design points 17-25.

Example 2.56 [21]

This example refers to the research of this chemical reaction:

A + B → C + D

where two reactants, A and B, give a mixture of products C and D. The final product
of the chemical reaction contains the mixture of C and D and the unchanged components
A and B. The experimental objective is to get the maximal yield of component C, under
the condition that the content of D does not exceed 20% (more than 20% of component
D causes difficulties in refining). The research included variations of these factors:
X1, temperature, °C; X2, initial concentration of reactant A, %; X3, reaction time,
h. The content of component B was kept constant. Prior research of
this chemical reaction indicates the existence of interactions between the analyzed
factors. Therefore, FUFE 2³ with the following factor-variation levels was selected as
the basic design of experiment:

Table 2.172  Factor variation intervals

Factors                 Variation levels          Variation intervals
                        -       0       +         Δx
x1 - temperature        142     147     152       5
x2 - concentration C    35      37.5    40        2.5
x3 - reaction time      7       8.5     10        1.5

The FUFE design with outcomes is given in Table 2.173.


Table 2.173  FUFE 2³

       Design matrix       Response, yield of C, %
No.    X1    X2    X3      yu
1      -     -     -       55.9
2      -     -     +       63.3
3      -     +     -       67.5
4      -     +     +       68.8
5      +     -     -       70.6
6      +     -     +       68.0
7      +     +     -       68.6
8      +     +     +       62.4

By applying formulas (2.62) to (2.64) we get:

b0=65.64; b1=1.76; b2=1.19; b3=-0.01;

b12=-3.09; b13=-2.19; b23=-1.21 and b123=0.31.

It is evident that the obtained effects of the three two-factor interactions are large
when compared to the linear effects. This means that the response surface is curved and
that the optimum is near. To describe the optimum by a second-order model it is
necessary to upgrade the basic FUFE 2³ to CCOD. Seven additional design points are
needed for this: six in starlike points and one in the design
center. Variation levels for the additional design points are given in Table 2.174.
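The coefficients obtained from the FUFE 2³ of Table 2.173 can be reproduced with a few lines of code. In a two-level full factorial every coefficient is the sum of the responses, signed by the product of the corresponding coded levels, divided by the number of design points. The sketch below is an illustration of that rule with names chosen here, not the formulas (2.62) to (2.64) verbatim.

```python
from itertools import combinations
from math import prod

# (X1, X2, X3, yield of C in %) for the eight design points of Table 2.173
runs = [
    (-1, -1, -1, 55.9), (-1, -1, +1, 63.3), (-1, +1, -1, 67.5), (-1, +1, +1, 68.8),
    (+1, -1, -1, 70.6), (+1, -1, +1, 68.0), (+1, +1, -1, 68.6), (+1, +1, +1, 62.4),
]

n = len(runs)
coef = {"b0": sum(r[3] for r in runs) / n}

# Linear terms (order 1), two-factor (order 2) and three-factor (order 3) interactions
for order in (1, 2, 3):
    for idx in combinations(range(3), order):
        name = "b" + "".join(str(i + 1) for i in idx)
        coef[name] = sum(prod(r[i] for i in idx) * r[3] for r in runs) / n

for name, value in coef.items():
    print(f"{name} = {value:.2f}")
```

The interaction coefficients b12, b13 and b23 come out larger in absolute value than b2 and b3, which is the observation that motivates the upgrade to a second-order design.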

Table 2.174  Factor-variation intervals

Factors                 Variation levels
                        -2      0       +2
x1 - temperature        137     147     157
x2 - concentration C    32.5    37.5    42.5
x3 - reaction time      5.5     8.5     11.5

The design matrix for the additional design points, with experimental outcomes, is
given in Table 2.175.


Table 2.175  Additional design points for CCOD

       Design matrix       Response, yield of C, %
No.    X1    X2    X3      yu
9      0     0     0       66.9
10     2     0     0       65.4
11     -2    0     0       56.9
12     0     2     0       67.5
13     0     -2    0       65.0
14     0     0     2       68.9
15     0     0     -2      60.3

Six design points in the experimental center had to be done due to Table 2.164.
For the sake of economy, only one design point in the experimental center was set in
this example, since the reproducibility variance was obtained in the basic
experiment. By processing all 15 design points, the following regression coefficients
of the second-order model were obtained:

b0=67.711; b1=1.944; b2=0.906; b3=1.069; b11=-1.539;

b22=-0.264; b33=-0.676; b12=-3.088; b13=-2.188; b23=-1.212.

The second-order regression model has this form:

$$\hat{y} = 67.711 + 1.944X_1 + 0.906X_2 + 1.069X_3 - 1.539X_1^2 - 0.264X_2^2 - 0.676X_3^2 - 3.088X_1X_2 - 2.188X_1X_3 - 1.212X_2X_3 \tag{2.120}$$

2.3.4

D-optimality,  Bk-designs  and  Hartley’s  Second-order  Designs 

D-optimality designs are very attractive for researchers, as their application provides
maximal accuracy in estimating regression coefficients. The price of this higher
accuracy is an increased number of design points, which explains why such
designs are less frequently used in practice. The continuous D-optimality design [34],
constructed on a k-dimensional cube, has this formula for the number of design
points [34]:

$$N = 2^k + k \times 2^{k-1} + \frac{k(k-1)}{2} \times 2^{k-2} \tag{2.121}$$

where:

k is the number of factors.

In the given case, the experimental points are in the hypercube vertices, in the middle
of the edges and in the centers of the two-dimensional planes.

Kono's designs [46] are attempts to reduce the number of design points in continuous
D-optimality designs (constructed on a hypercube) by replacing all points in the
centers of two-dimensional planes with one point in the hypercube center. The
number of design points in Kono's designs is defined by this expression:

$$N = 2^k + k \times 2^{k-1} + 1 \tag{2.122}$$
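To see how quickly the two point counts diverge, expressions (2.121) and (2.122) can be evaluated for a few values of k. The function names in this short sketch are illustrative; the formulas are those of the text, with k(k-1)/2 written as the binomial coefficient.

```python
from math import comb

def n_d_optimal(k: int) -> int:
    """Eq. (2.121): vertices + edge midpoints + centers of two-dimensional planes."""
    return 2**k + k * 2**(k - 1) + comb(k, 2) * 2**(k - 2)

def n_kono(k: int) -> int:
    """Eq. (2.122): plane centers replaced by the single hypercube center."""
    return 2**k + k * 2**(k - 1) + 1

for k in (2, 3, 4):
    print(k, n_d_optimal(k), n_kono(k))
```

For k=2 both expressions give 9 design points, since the square has a single two-dimensional plane whose center is the design center. For k=3 the counts are 26 and 21, and for k=4 they are 72 and 49.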

Tables with practical formulas for calculating regression coefficients in Kono's
designs are known [46]. The design matrices and the coefficients for calculating
regression coefficients for Kono's k=2 and k=3 designs are given in Tables 2.176 and 2.177.
Analogous data for k=4 are given in reference [46]. This reference also contains a
dispersion matrix that may serve to check the statistical significance of regression
coefficients for Kono's designs with different numbers of factors. Let us analyze an actual
application of Kono's design for obtaining a second-order regression model for k=2. This
experiment includes nine design points, with the design shown in Table 2.176.


Table 2.176  Kono's design 2² + 2×2²⁻¹ + 1

                         Coefficients
No. trials   X1    X2    b0        b11       b22       b1        b2        b12      yu    ŷu      (yu - ŷu)²
1            0     0     0.5772    -0.3234   -0.3234   0         0         0        62    64.4    5.76
2            +     +     -0.1057   0.1691    0.1691    0.1961    0.1961    0.25     50    48.5    2.25
3            -     +     -0.1057   0.1691    0.1691    -0.1961   0.1961    -0.25    67    67.86   0.64
4            -     -     -0.1057   0.1691    0.1691    -0.1961   -0.1961   0.25     60    62.2    4.84
5            +     -     -0.1057   0.1691    0.1691    0.1961    -0.1961   -0.25    70    70.0    0
6            +     0     0.2114    0.1617    -0.3383   0.1078    0         0        56    58.6    6.76
7            0     +     0.2114    -0.3383   0.1617    0         0.1078    0        60    61.1    1.21
8            -     0     0.2114    0.1617    -0.3383   -0.1078   0         0        70    64.4    31.36
9            0     -     0.2114    -0.3383   0.1617    0         -0.1078   0        73    71.8    1.44
Sum                                                                                               54.26

Kono's design matrix corresponds to the second-order orthogonal design matrix, so
that in this case the experimental outcomes from Example 2.51, Table 2.165, were used.
The value of regression coefficient b0 is determined by summing the products of the
coefficients in the associated column and the responses:

b0 = 0.5772×62 - 0.1057×50 - 0.1057×67 - 0.1057×60 - 0.1057×70 + 0.2114×56
+ 0.2114×60 + 0.2114×70 + 0.2114×73 = 64.43

Values of the other regression coefficients are determined in the same way, so that
the regression model becomes:

$$\hat{y} = 64.43 - 2.88X_1 - 3.95X_2 - 6.80X_1X_2 - 2.90X_1^2 + 0.60X_2^2 \tag{2.123}$$
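The way the coefficient columns of Table 2.176 are used can be illustrated with a short script: each regression coefficient is the sum of the responses weighted by the entries of its column. The sketch below is only an illustration with names chosen here. With the tabulated weight 0.25 the interaction coefficient evaluates to -6.75, slightly different from the -6.80 quoted in model (2.123); the other coefficients agree to two decimals.

```python
# Responses y_u in the row order of Table 2.176
y = [62, 50, 67, 60, 70, 56, 60, 70, 73]

# Weight columns of Table 2.176, one list per regression coefficient
weights = {
    "b0":  [0.5772, -0.1057, -0.1057, -0.1057, -0.1057, 0.2114, 0.2114, 0.2114, 0.2114],
    "b11": [-0.3234, 0.1691, 0.1691, 0.1691, 0.1691, 0.1617, -0.3383, 0.1617, -0.3383],
    "b22": [-0.3234, 0.1691, 0.1691, 0.1691, 0.1691, -0.3383, 0.1617, -0.3383, 0.1617],
    "b1":  [0, 0.1961, -0.1961, -0.1961, 0.1961, 0.1078, 0, -0.1078, 0],
    "b2":  [0, 0.1961, 0.1961, -0.1961, -0.1961, 0, 0.1078, 0, -0.1078],
    "b12": [0, 0.25, -0.25, 0.25, -0.25, 0, 0, 0, 0],
}

# Each coefficient is the weighted sum of the nine responses
coef = {name: sum(w * v for w, v in zip(col, y)) for name, col in weights.items()}

for name, value in coef.items():
    print(f"{name} = {value:.2f}")
```

Because the weights are tabulated once for a given design, the same columns can be reused for any set of nine responses measured in this design.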


To check lack of fit of the regression model, its variance is determined:

$$S^2_{AD} = \frac{25\sum_{u=1}^{9}\left(\bar{y}_u - \hat{y}_u\right)^2}{9 - 6} = \frac{25 \times 54.26}{9 - 6} = 452.17$$

Where the reproducibility variance is $S^2_y = 375.0$, the arithmetic value of Fisher's
criterion is $F_R = 1.21$.


Table 2.177  Kono's design 2^3 + 3x2^(3-1) + 1

No.     X1  X2  X3   Coefficients
trials               b0       b11      b22      b33      b1       b2       b3       b12      b13      b23
 1      0   0   0    0.6554  -0.2479  -0.2479  -0.2479   0        0        0        0        0        0
 2      +   +   +   -0.0861   0.0630   0.0630   0.0630   0.0804   0.0804   0.0804   0.0979   0.0979   0.0979
 3      +   -   +   -0.0861   0.0630   0.0630   0.0630   0.0804  -0.0804   0.0804  -0.0979   0.0979  -0.0979
 4      -   -   +   -0.0861   0.0630   0.0630   0.0630  -0.0804  -0.0804   0.0804   0.0979  -0.0979  -0.0979
 5      -   +   +   -0.0861   0.0630   0.0630   0.0630  -0.0804   0.0804   0.0804  -0.0979  -0.0979   0.0979
 6      +   +   -   -0.0861   0.0630   0.0630   0.0630   0.0804   0.0804  -0.0804   0.0979  -0.0979  -0.0979
 7      +   -   -   -0.0861   0.0630   0.0630   0.0630   0.0804  -0.0804  -0.0804  -0.0979  -0.0979   0.0979
 8      -   -   -   -0.0861   0.0630   0.0630   0.0630  -0.0804  -0.0804  -0.0804   0.0979   0.0979   0.0979
 9      -   +   -   -0.0861   0.0630   0.0630   0.0630  -0.0804   0.0804  -0.0804  -0.0979   0.0979  -0.0979
10      +   0   +    0.0861   0.0620  -0.1880   0.0620   0.0446   0        0.0446   0        0.0542   0
11      0   -   +    0.0861  -0.1880   0.0620   0.0620   0       -0.0446   0.0446   0        0       -0.0542
12      -   0   +    0.0861   0.0620  -0.1880   0.0620  -0.0446   0        0.0446   0       -0.0542   0
13      +   -   0    0.0861   0.0620   0.0620  -0.1880   0.0446  -0.0446   0       -0.0542   0        0
14      -   -   0    0.0861   0.0620   0.0620  -0.1880  -0.0446  -0.0446   0        0.0542   0        0
15      -   +   0    0.0861   0.0620   0.0620  -0.1880  -0.0446   0.0446   0       -0.0542   0        0
16      +   +   0    0.0861   0.0620   0.0620  -0.1880   0.0446   0.0446   0        0.0542   0        0
17      0   +   +    0.0861  -0.1880   0.0620   0.0620   0        0.0446   0.0446   0        0        0.0542
18      +   0   -    0.0861   0.0620  -0.1880   0.0620   0.0446   0       -0.0446   0       -0.0542   0
19      0   -   -    0.0861  -0.1880   0.0620   0.0620   0       -0.0446  -0.0446   0        0        0.0542
20      -   0   -    0.0861   0.0620  -0.1880   0.0620  -0.0446   0       -0.0446   0        0.0542   0
21      0   +   -    0.0861  -0.1880   0.0620   0.0620   0        0.0446  -0.0446   0        0       -0.0542
In practice, we often use designs whose properties are very close to those of D-optimal designs, but that contain a smaller number of design points. Such designs are known as Bk and Hartley's designs.

Bk-designs are constructed on a k-dimensional cube with an equal number of design points in cube vertices and in the center (k+1) of edges [34].

From the practical point of view, design B4 is also interesting as it includes only 24 design points, i.e. 2^4 = 16 FUFE design points (points in the hypercube vertices) and an additional 8 star points. The matrix of such a design is shown in Table 2.178.

This  regression  model  is  obtained  by  processing  experimental  outcomes  [47]: 

$y = 4.724 + 0.177X_1 + 0.327X_2 - 0.155X_3 + 0.151X_4 - 0.577X_1^2 - 0.114X_2^2 + 0.033X_3^2 - 0.244X_4^2 + 0.211X_1X_2 - 0.119X_1X_3 + 0.133X_1X_4 + 0.142X_2X_3 + 0.261X_2X_4 + 0.223X_3X_4$  (2.124)


Table 2.178  B4 design

No. trials   X1   X2   X3   X4   yu
 1           -    -    -    -    12.4
 2           -    -    -    +    20.7
 3           -    -    +    -    13.1
 4           -    -    +    +    27.5
 5           -    +    -    -    45.1
 6           -    +    -    +    60.6
 7           -    +    +    -    44.8
 8           -    +    +    +    58.9
 9           +    -    -    -    15.0
10           +    -    -    +    19.5
11           +    -    +    -    10.8
12           +    -    +    +    20.2
13           +    +    -    -    39.2
14           +    +    -    +    56.7
15           +    +    +    -    41.4
16           +    +    +    +    63.5
17           +    0    0    0    31.3
18           -    0    0    0    34.9
19           0    +    0    0    49.9
20           0    -    0    0    26.3
21           0    0    +    0    37.9
22           0    0    -    0    41.3
23           0    0    0    +    44.6
24           0    0    0    -    30.8

Table 2.179  HA5-design

No. trials   X1   X2   X3   X4   X5
 1           +    +    +    +    +
 2           -    +    +    +    -
 3           +    -    +    +    -
 4           -    -    +    +    +
 5           +    +    -    +    -
 6           -    +    -    +    +
 7           +    -    -    +    +
 8           -    -    -    +    -
 9           +    +    +    -    -
10           -    +    +    -    +
11           +    -    +    -    +
12           -    -    +    -    -
13           +    +    -    -    +
14           -    +    -    -    -
15           +    -    -    -    -
16           -    -    -    -    +
17           +    0    0    0    0
18           -    0    0    0    0
19           0    +    0    0    0
20           0    -    0    0    0
21           0    0    +    0    0
22           0    0    -    0    0
23           0    0    0    +    0
24           0    0    0    -    0
25           0    0    0    0    +
26           0    0    0    0    -
27           0    0    0    0    0
Hartley's designs include design points in cube vertices (regular fractional replica), as well as “starlike” and null design points. Design HA5 (k=5) is of particular importance because it is especially efficient. The total number of design points for such a design is small: N=27. Its matrix is given in Table 2.179.

2.3.5 

Conclusion  after  Obtaining  Second-order  Model 

After  obtaining  an  optimal  mathematical  model  of  a research  subject,  a conclusion 
is  brought  down  to  a statement  on  lack  of  fit  or  inadequacy  of  the  obtained  second- 
order  regression  model.  The  significance  of  regression  coefficients  in  second-order 
models  is  marginal  in  comparison  with  their  significance  in  linear  models,  since 
purposes  of  obtaining  linear  and  nonlinear  models  differ. 

As said before, linear models are used to reach (move towards) optimum, so that the significance of regression coefficients is an assumption for successful application of the steepest ascent method. Linear models, therefore, include as many factors as possible, and full factorial experiments are even replicated with increased factor variation intervals.

With  nonlinear  models,  which  are  aimed  at  mathematical  modeling  or  adequate 
description  of  the  optimum  region  and  that  as  a rule  have  numerous  regression 
coefficients,  rejection  of  insignificant  regression  coefficients  is  not  so  important  as 
in  the  phase  of  linear  modeling.  For  second-order  models,  an  estimate  of  lack  of  fit 
or  inadequacy  of  the  model  is  of  particular  importance. 

Inadequate  second-order  model 

As with an inadequate linear model, it is possible to switch to a higher-order, i.e. third-order, model. However, realization, processing of experimental results, and analysis and interpretation are very complicated for third-order designs, which makes such a suggestion not efficient enough [16].

Removing  the  model  inadequacy  by  introducing  rejected  factors  (in  the  phase  of 
screening  experiments  and  linear  model  analysis)  and  by  an  increased  number  of 
trial  replications,  is  much  more  acceptable. 

Adequate second-order model

When  processing  of  experimental  outcomes  shows  an  adequate  regression  model, 
the  problem  of  mathematical  modeling  of  response  optimum  is  terminated,  since 
an  interpolation  model  of  the  research  subject  has  been  obtained. 

In the case of an extreme experiment, we are faced with determining optimum coordinates from the obtained mathematical model. In that case, canonical analysis or methods of nonlinear programming are mostly used. The optimum coordinates obtained on a research, lab level are the starting point for a switch from lab to pilot-plant or full-scale levels 3). The procedure is in principle repeated in a larger system within the defined experimental center [48, 18]. The obtained mathematical model and coordinates of the optimum are particularly useful for problems of process control in a full-scale plant.

3) “Scale-up” is the English term for a switch from smaller to larger-scale systems.


2.4 

Statistical  Analysis 

2.4.1 

Determination  of  Experimental  Error 

A researcher who manages an experimental study by applying design of experiments must, before starting it, have a clear idea of the methods for processing experimental results in the actual case, so as to facilitate defining the research objective and the drawing of conclusions. A mathematical theory of experiments differentiates several types of errors in processing experimental results, each of which is characteristic of a definite phase of data analysis. Each experiment consists of a certain number of design points-trials, each design point-trial of one or more replicated trials, and a single design point of one or more replicated measurements (determinations). In accordance with this we distinguish experiment error (reproducibility variance), trial error (variance of replicated trials) and measurement error (determination error). Measurement error is local to a trial, and may come from an instrument, from specimen differences (sampling error), or from both. It can be reduced by taking repeated measurements and averaging. Whether or not such averaging is worthwhile depends on its magnitude with respect to the trial error. The standard deviation of measurement error may be estimated from repeated measurements within the same trial. The measurement variance is the square of the standard deviation of the measurement error. Recognition of the design point-trial and experimental errors is necessary for a verification of the significance of regression coefficients and lack of fit of a regression model.
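The distinction between measurement error within a trial and the scatter between replicated trials can be illustrated with a small calculation. The sketch below uses entirely hypothetical data (three replicated trials at the same factor settings, four measurements each) and hypothetical function names; it assumes an equal number of measurements per trial, so that the pooled within-trial variance is the plain mean of the per-trial variances.

```python
from statistics import mean, variance

# Hypothetical data: 3 replicated trials at identical factor settings,
# each with 4 repeated measurements of the response.
trials = [
    [10.1, 10.3, 9.9, 10.1],
    [10.6, 10.8, 10.7, 10.5],
    [9.8, 9.6, 9.7, 9.9],
]


def measurement_variance(trials):
    """Pooled within-trial variance (equal number of measurements per trial assumed)."""
    return mean(variance(t) for t in trials)


def replicate_variance(trials):
    """Variance of the trial means, i.e. the scatter between replicated trials."""
    return variance([mean(t) for t in trials])


s2_meas = measurement_variance(trials)
s2_rep = replicate_variance(trials)
print(round(s2_meas, 4), round(s2_rep, 4))
```

In this made-up data set the scatter between trials is much larger than the measurement variance, so averaging more measurements per trial would bring little benefit.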

Measurement-determination  error  has  been  discussed  in  Sect.  2.1.4. 

Trial error and experimental error also belong to the group of random errors so that in estimating their values we use the same approach as for random measurement errors. In determining a measurement error we take into account the number of replicate measurements (u); in a trial error the number of replicate trials (n), and in an experimental error the number of different trials (N). Replication of a trial must not be mixed up with replication of measurements in one trial. When determining a trial error, we estimate the standard deviation of replicated trials. This may be estimated by calculating the standard deviation of several trials whose control factor settings are the same. Ideally one would set up and run the same trial repeatedly. The small differences in a setup are an important component of the replicate error. The replicate error is made up of two parts: trial error and measurement error. The replicate variance is just the square of the replicate standard deviation. In the case of experimental error we estimate the variance of reproducibility. Prior to calculating experimental error it is necessary to check variance homogeneity of different trials
III

Mixture  Design  “Composition-Property” 

3.1 

Screening  Design  “Composition-Property” 

To begin any kind of experimental study one should review the reference literature and discuss problems with researchers from the field. The result of such a preliminary study of a research subject is mostly a long list of potential factors of a system. Modern strategy of experimental research, as has been described in previous chapters, begins with screening experiments, their task being to find the most significant factors for the observed system. Screening experiments most frequently solve the lowest level of a research objective, i.e. screening out factors by the significance of their effects on the system response. More complex research objectives require a mathematical model of the researched phenomenon, which comes down to constructing a regression model and analyzing the obtained response surface. Screening experiments offer sufficient data for obtaining a linear regression model:

$E(y) = \beta_0 + \beta_1X_1 + \beta_2X_2 + \dots + \beta_pX_p$  (3.1)

Relative values of factor effects are proportional to the values of regression coefficients $\beta_i$ of the associated coded factors $X_i$. A factor may be statistically insignificant if the value of the associated regression coefficient is small enough. Part of the philosophy of screening experiments from Sect. 2.2 is transferred to selecting components of a mixture or composition. The main difference lies both in the concept of screening and in the construction of a design of experiments matrix. The difference arises from specific constraints on the component ratios in the mixture. In mixture experiments, the response (y) is a function only of the ratios $X_i$ of the q components and not of the total quantity of the mixture. Constraints refer to the ratio of each of the q components and to the sum of ratios of all components. It is mathematically expressed in this way:

$0 \le X_i \le 1.0; \quad \sum_{i=1}^{q} X_i = 1.0$  (3.2)

It is evident that component levels-ratios are not independent and that the level of one component depends on the levels of the other q-1 components. Screening experiments are recommended in situations when the number of components is q≥6. For five or fewer components, fitting a second-order model is acceptable [1]. A design matrix for a five-component system and a second-order model usually contains 20 to 25 trials-design points, which is acceptable. For a second-order model for six or more mixture components, the number of design points-trials of a design matrix is large, especially if one accounts for the fact that it is uncertain whether all mixture components are significant and whether all the selected variation levels of component ratios are important. Due to this complexity, a screening experiment with a small number of trials is necessary. Screening out of components in the mixture comes down to an analysis of the response surface, or of the response in all directions of the factor space. A response-surface analysis or response-value analysis means determining directions, or axes of the factor space, where response values are constant, close to being constant or change only a little. It is an evident assumption that significant mixture components have great linear effects. This is correct in principle, but there are exceptions. It is therefore suggested [2] to include into the screening design of experiments at least one trial in its center in order to catch nonlinearity, which may be present. Linear regression models by Scheffé [3] have the form:

$E(y) = \beta_1X_1 + \beta_2X_2 + \dots + \beta_qX_q$  (3.3)

Constant $\beta_0$ is omitted due to the limitation that the sum of ratios of all components is equal to one. Screening out of components comes down to finding components $X_i$ that have no effects and/or have the same effects on the property-response of the mixture:

• If the value of coefficient $\beta_i$ is equal to the mean of the remaining coefficients in the model, then $E_i = 0$:

$E_i = \beta_i - (q-1)^{-1}\sum_{j \ne i}\beta_j$  (3.4)

• or y values do not vary in any direction normal to $X_i = 0$ (simplex basis). In other words, there is no change of y in the direction of the normal passing through the simplex centroid. Geometrically, this is a one-dimensional subspace where $X_j = (1-X_i)/(q-1)$ for each $j \ne i$. Value $E_i$ is, all over the simplex, called the linear effect of factor $X_i$. Constant responses in a direction parallel to the $X_i$-axis (in the direction of $X_i$) indicate the insignificance of the factor-component $X_i$.

• When two or more coefficients are equal (e.g. $\beta_1 = \beta_2 = \dots = \beta_e$) then the effects of associated components are equal in the experimental region. The associated sum may be considered as one component, which means a reduction in the number of basic components. This also means that there is no variation of y in the q-dimensional local space of simplex, where “e” is the number of components with equal effects.

Example  3.1 

Examples of response surfaces for the two mentioned cases are shown in Figs. 3.1 and 3.2.

[Figures 3.1 and 3.2: response surfaces of three-component mixtures; Fig. 3.1 is labelled y = 80X1 + 90X2 + 100X3]

The response surface in Fig. 3.1 is defined by the regression model $y = 80X_1 + 90X_2 + 100X_3$, where we can see that there is no variation of y in direction $X_2$:

$\beta_2 - (\beta_1 + \beta_3)/2 = 90 - (80 + 100)/2 = 0$

Contour or isoresponse lines are parallel to the $X_2$-axis. The response surface in Fig. 3.2 is defined by the regression model $y = 90X_1 + 80X_2 + 80X_3$. It is evident that the regression coefficients for components $X_2$ and $X_3$ are equal ($\beta_2 = \beta_3$) and that there are no response changes along the response isolines $X_2 + X_3 = \text{const}$.
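The component effects defined by Eq. (3.4) are easy to evaluate for the two models of this example. The following sketch assumes only Eq. (3.4) and the coefficients quoted above; the function name `component_effects` is a hypothetical helper.

```python
def component_effects(betas):
    """Effect of each component per Eq. (3.4): beta_i minus the mean of the other coefficients."""
    q = len(betas)
    total = sum(betas)
    return [b - (total - b) / (q - 1) for b in betas]


fig_3_1 = component_effects([80, 90, 100])   # model of Fig. 3.1
fig_3_2 = component_effects([90, 80, 80])    # model of Fig. 3.2

print(fig_3_1)  # [-15.0, 0.0, 15.0]
print(fig_3_2)  # [10.0, -5.0, -5.0]
```

For the first model the effect of X2 is zero, in agreement with the contrast computed in the text; for the second model the effects of X2 and X3 coincide, which reflects their equal coefficients.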


Example  3.2 

Regression coefficients in the regression model $y = 80X_1 + 90X_2 + 90X_3 + 100X_4$ satisfy:

$\beta_2 = \beta_3; \quad \beta_2 = (\beta_1 + \beta_3 + \beta_4)/3; \quad \beta_3 = (\beta_1 + \beta_2 + \beta_4)/3$

Preliminary analysis shows that in this case the mixture has components that have the same but statistically insignificant effects. Effects of components $X_1$ and $X_4$ are significant, while effects of components $X_2$ and $X_3$ are equal and insignificant. In Example 3.5, where a mixture of eight components is analyzed, there exist separately components with no effects and components with equal effects. The effect of component $X_i$ may be analyzed more generally than the definition given by Eq. (3.4). Let us observe a matrix with components that have limitations in variation: $0 \le a_i \le X_i \le b_i \le 1$. Variation effects of components in such situations are defined by this expression:

$E_i = R_i\left[\beta_i - (q-1)^{-1}\sum_{j \ne i}\beta_j\right]$  (3.5)

where:

$R_i = b_i - a_i$.

Cox [3] has in his work developed a different expression for the effect of component $X_i$ in a linear model:

$E_i = R_i\left[\beta_i - (1-s_i)^{-1}\sum_{j \ne i}\beta_j s_j\right]$  (3.6)


It should be noted that Cox's approach explicitly defines the standard-referential mixture or standard referential composition.

Hence, in expression (3.6): $s = (s_1, s_2, \dots, s_q)$ is the referential composition. If the referential composition is in the simplex centroid, then $s_1 = \dots = s_q = 1/q$, so that expression (3.6) is transformed into expression (3.5). Actually, all referential compositions lying on the axis of the i-th component (except the vertex, where $s_i = 1$) will have effects $E_i$ as defined by expression (3.5), measured in the direction parallel to the axis of the i-th component, i.e. normal to the simplex basis opposite the i-th vertex. That is, the effect of component $X_i$ is measured normally to the subspace of the remaining components. It should be pointed out that in the case of variations of component ratios with no limitations, the effect of component $X_i$ is determined parallel to the $X_i$-axis, orthogonally to the subspace of the remaining components. The simplex centroid has a unique meaning as it is the point that lies on the axes of all components. It should also be noted that directions in which effects are determined do not depend on component transformations into pseudo-components, or vice versa. Referential compositions outside the simplex centroid have effects $E_i$ that are determined in the direction of the line passing through the vertex of pure $X_i$ but not parallel to the axis of the i-th component. Such a direction, for example, suits the determining of the effect for component $X_i$ that has been physically added to the existing referential composition, which has fixed quantities of other components. In Figs. 3.1 and 3.2 coefficient contrasts are exactly equal to zero, so that there is no doubt that the response is constant in the associated directions. This is not the case in practice, so that a statistical procedure is defined to establish which of the contrasts are statistically significantly different from zero. Contrasts in matrix form may be written as $C\beta$, so that the variance of a linear contrast of regression coefficients is:

$V(C\beta) = \sigma^2 C(X'X)^{-1}C'$  (3.7)

where:

$\sigma^2$ is the error variance;

C is the coefficient vector that defines the contrast;

X is the matrix (n×q) of component ratios;

$\sigma^2(X'X)^{-1}$ is the variance-covariance matrix of regression coefficients.

Example 3.3 [4]

Vector C for contrast $\beta_2 - (\beta_1 + \beta_3)/2$ has this form: C=(-0.5; 1; -0.5). Null hypothesis $H_0: C\beta = 0$ is tested by Student's t-test:

$t = C\beta / [\sigma^2 C(X'X)^{-1}C']^{0.5}$  (3.8)

where the number of degrees of freedom corresponds to variance $\sigma^2$.
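The t-statistic of Eq. (3.8) can be sketched in plain Python. Everything numerical below is hypothetical and serves only as an illustration: the design (three pure components, each run twice), the estimated coefficients and the error variance are assumed values, and `invert` and `contrast_t` are helper names introduced here. The small Gauss-Jordan routine assumes X'X is nonsingular.

```python
def invert(m):
    """Gauss-Jordan inverse of a small nonsingular square matrix (list of lists)."""
    n = len(m)
    a = [[float(v) for v in row] + [float(i == j) for j in range(n)]
         for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


def contrast_t(X, b, C, sigma2):
    """Return (contrast estimate, t) with t = Cb / sqrt(sigma2 * C (X'X)^-1 C')."""
    q = len(C)
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(q)] for i in range(q)]
    inv = invert(xtx)
    estimate = sum(c * bi for c, bi in zip(C, b))
    var = sigma2 * sum(C[i] * inv[i][j] * C[j] for i in range(q) for j in range(q))
    return estimate, estimate / var ** 0.5


# Hypothetical design: each pure component run twice.
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
b = [80.0, 95.0, 100.0]          # hypothetical coefficient estimates
C = [-0.5, 1.0, -0.5]            # contrast vector from Example 3.3
estimate, t = contrast_t(X, b, C, sigma2=3.0)
print(estimate, round(t, 3))
```

The resulting t would be compared with the tabulated Student's value for the degrees of freedom with which the error variance was estimated.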

For each interesting contrast, one can do the given test, and the regression model can be reduced in accord with its results. An F-test for a difference between the residual mean sum of squares of a full and a reduced regression model is also possible. Error variance or reproducibility variance may also be determined as before, from replicate design points or from previous experiments. It should be stressed that the analyzed statistical testing is secondary in screening experiments of the “mixture” type, and that the primary thing is the ranking of effects or component regression coefficients.

3.1.1 

Simplex  Lattice  Screening  Designs 

The efficiency of screening experiment designs depends on the form of the experimental domain. If this domain is the total simplex ($0 \le X_i \le 1$; i=1, 2,..., q), then a design of experiments with (2q+1) design points-trials is recommended. In that case the design of experiments includes q pure components ($X_i = 1.0$), the simplex centroid ($X_i = q^{-1}$ for all i=1, 2,..., q) and q internal points with coordinates:

$[(2q)^{-1}, (2q)^{-1}, \dots, (q+1)/2q, \dots, (2q)^{-1}]$.

It should be said that the q responses of pure components make determination of regression coefficients of the linear model possible, while the q internal points and the central point serve to estimate the nonlinearity of the response surface. It is useful to include in the mentioned designs of experiments q points of "null effects" in this form:

$[(q-1)^{-1}, (q-1)^{-1}, \dots, 0, \dots, (q-1)^{-1}]$

Null effects are included in situations when it is expected that absence of a component may have a strong effect on the response level. Designs of experiments which include null effects have (3q+1) design points-trials. Such designs belong to simplex lattice screening designs and should not be mixed up with Scheffé simplex lattice designs [5] and simplex centroid designs [6], which have considerably more trials (for q>5) and facilitate the fitting of more complicated models. Simplex screening designs contain four groups of trials, here given in Table 3.1.
Table 3.1  Simplex screening designs

Mark   Name           Number   Composition
A      Vertex         q        Xi = 1; Xj = 0, each j≠i
B      Internal       q        Xi = (q+1)/2q; Xj = (2q)^-1, each j≠i
C      Centroid       1        Xi = q^-1, each i
D      Null effects   q        Xi = 0; Xj = (q-1)^-1, each j≠i

Design matrix

Group code   Name           No. trials   X1          X2          ...   Xq
A            Vertex         1            1           0           ...   0
                            2            0           1           ...   0
                            ...
                            q            0           0           ...   1
B            Internal       1            (q+1)/2q    1/2q        ...   1/2q
                            2            1/2q        (q+1)/2q    ...   1/2q
                            ...
                            q            1/2q        1/2q        ...   (q+1)/2q
C            Centroid       1            q^-1        q^-1        ...   q^-1
D            Null effects   1            0           (q-1)^-1    ...   (q-1)^-1
                            2            (q-1)^-1    0           ...   (q-1)^-1
                            ...
                            q            (q-1)^-1    (q-1)^-1    ...   0
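The four groups of Table 3.1 can be generated mechanically for any number of components q. The sketch below is one possible way to do it, using exact fractions so that each composition sums to one without rounding; the function name and the tuple layout (group mark, composition) are hypothetical choices made for illustration.

```python
from fractions import Fraction


def simplex_screening_design(q, null_effects=True):
    """Return a list of (group, composition) pairs following Table 3.1."""
    one, zero = Fraction(1), Fraction(0)
    design = []
    # Group A: q vertices (pure components).
    for i in range(q):
        design.append(("A", [one if j == i else zero for j in range(q)]))
    # Group B: q internal points, (q+1)/2q for component i and 1/2q elsewhere.
    high, low = Fraction(q + 1, 2 * q), Fraction(1, 2 * q)
    for i in range(q):
        design.append(("B", [high if j == i else low for j in range(q)]))
    # Group C: the centroid, 1/q for every component.
    design.append(("C", [Fraction(1, q)] * q))
    # Group D (optional): q null-effect points, component i absent.
    if null_effects:
        for i in range(q):
            design.append(("D", [zero if j == i else Fraction(1, q - 1)
                                 for j in range(q)]))
    return design


design = simplex_screening_design(3)
for group, comp in design:
    print(group, [str(x) for x in comp])
```

For q=3 the levels of each component along its own axis are 0, 1/3, 2/3 and 1, which matches the axis values 0, 0.33, 0.67 and 1.0 of Fig. 3.3.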

Graphic  analysis  of  simplex  screening  experiments 

Graphic analysis of simplex screening designs comes down to the graphic presentation of response in the direction of each axis or component. When a design of experiments has all four groups of trials, then on each axis there are four compositions-mixtures. An example of graphic analysis of a three-component composition of a rocket propellant is given in the work of Kurotori [7] in Fig. 3.3.

The elasticity of a rocket propellant was measured as the system response. An analysis of Fig. 3.3 indicates that all three components of the composition have large effects and that the response surface is greatly curved or nonlinear. Besides, the greatest response values were obtained close to the centroid. Large response values also occurred for lower $X_1$ levels and for $X_3$ values between 0.33 and 0.67. Graphic interpretation of the simplex screening experiment may be compared to the response surface contour plots of the same example, Fig. 3.4.
Figure 3.3  Three-component mixture response (horizontal axis: component levels Xi = 0, 0.33, 0.67, 1.0)


Figure 3.4  Response for a three-component composition of rocket propellant

It is evident that graphic interpretation of the simplex screening experiment offers only a rough idea about the form of the response surface. It is therefore not recommended to replace the response surface contour plot with a graphic interpretation of the simplex screening experiment.

Example  3.4 

A simplex screening experiment was done for selection of a ten-component composition of petrol. The outcomes of the experiment are given in Table 3.2. All four groups of design points have been included in this ten-component design of experiments. The design included 3x10+1 trials or 31 mixtures. Values of linear regression coefficients and of effects are given in Table 3.2. Graphic interpretation of this ten-component simplex screening experiment is shown in Fig. 3.5.
Figure 3.5  Simplex screening diagram for a ten-component mixture (vertical axis: octane number)


Table 3.2  Simplex screening experiment q=10

Components         Composition                           Regression coefficients
No.   Code         Null effects   Internal   Vertex      βi       Ei (Eq. 3.4)
 1    LSR          80.9           74.3       66.6        67.5     -13.1
 2    REFA         79.4           81.4       81.8        82.5       3.6
 3    REFB         79.0           82.2       84.5        85.0       6.4
 4    LLCC         79.3           80.8       80.9        81.5       2.5
 5    LHCCA        79.7           78.3       76.2        76.8      -2.7
 6    LHCCB        79.7           78.5       76.2        76.9      -2.6
 7    HLCC         79.3           80.9       81.3        81.9       2.9
 8    HHCCA        79.6           78.6       78.1        78.4      -0.9
 9    HHCCB        79.6           78.8       78.3        78.7      -0.7
10    POLY         78.7           82.0       82.5        83.3       4.5
95% Confidence interval                                  ±1.1      ±1.2

Response or property of mixture in centroid is 79.4


Analysis  of  Fig.  3.5  may  bring  us  to  several  conclusions: 

• Component No. 1 has a great negative effect. Components No. 5, 6, 8, 9 also have negative effects, while components 2, 3, 4, 7 and 10 have positive effects.

• Components 5 and 6, 8 and 9, 4 and 7, and 2 and 10 behave similarly in mixtures. This behavior corresponds to the physical essence of components. There are only minor differences in production of the mentioned components.

• Components 2, 4, 7, 8 and 9 have the lowest effects on the octane number of petrol.

• The response surface of the petrol octane number has small nonlinearity.

• With an increase in number of components q, the centroid is shifted towards the null-effect points ($X_i = 0$; $X_j = (q-1)^{-1}$, $j \ne i$), and internal points move away from the vertex.

It should be pointed out that to obtain a second-order model we have to do 66 trials, or 10 with pure components, 45 of binary composition, 10 internal points and one centroid point. However, 21 compositions out of the 31 mixtures from the screening experiment are simultaneously a part of the design of 66 design points for a second-order model. This shows that even in mixture experiments we may deal with the principle of upgrading/augmenting a design of experiments.

3.1.2 

Extreme  Vertices  Screening  Designs 

A large number of “mixture” problems have limitations on component ratios:

$0 \le a_i \le X_i \le b_i \le 1; \quad i = 1, 2, \dots, q$  (3.9)

Apart from the mentioned limitations, one may even meet a multiple limitation of this form:

$C_j \le A_{1j}X_1 + A_{2j}X_2 + \dots + A_{qj}X_q \le D_j$  (3.10)

Limitations (3.2), (3.9) and (3.10) geometrically form a hyperpolyhedron in factor space. The design of experiments for a linear model, according to Elfving [8], actually contains a definite number of hyperpolyhedron vertices. Due to the complicated procedure and criteria which are used to define vertices-design points, the XVERT algorithm [9] is nowadays implemented as computer software. Similar to the previous section, in order to check nonlinearity of the response surface, a point in the center of experiment or a centroid is included in a design of experiments. The XVERT algorithm defines extreme vertices of a design of experiments matrix in this way:

a) Rank the components in order of increasing range ($b_i - a_i$): $X_1$ has the smallest and $X_q$ the largest range;

b) Define a two-level design of experiments with the upper and lower limits of the q-1 components with the smallest ranges;

c) Calculate the level-ratio of the q-th component:

$X_q = 1.0 - \sum_{i=1}^{q-1} X_i$

d) A vertex is extreme if $a_q \le X_q \le b_q$. For vertices where $X_q$ falls outside its limitations, set $X_q$ to the upper or lower limit, whichever is closer to the calculated value.

e) For each vertex outside the limitations, define additional vertices (max.=q-1) by adjusting the level of one of the other q-1 components by the difference between the calculated value of $X_q$ and the adopted upper or lower limit, respectively. Additional vertices are defined only for the components whose adjusted levels remain inside component limitations.

This kind of algorithm defines $2^{q-1}$ vertices-trials, whereby each vertex generates at most q-1 additional vertices, so that the total possible number of design points is $2^{q-1} + (q-1)2^{q-1}$. A vertex is not generated when levels of just one component are outside limitations. Repeated vertices are dropped in calculations. A repeated vertex may appear only when the last component $X_q$ takes the highest ($b_q$) or lowest ($a_q$) level. Vertices of a two-level design of experiments are a subgroup of extreme vertices, obtained from the design $2^{q-1}$ thus:

a) Calculate $X_q$ levels for all vertices.

b) All vertices for which $a_q \le X_q \le b_q$ form the design core.

c) Each vertex for which the $X_q$ level is adjusted generates additional vertices that form the candidate subgroup.

d) Design of experiment consists of the core and one vertex from each candidate subgroup. The vertex of a given subgroup that enters the best design of experiments is found by examining all the possible combinations of the core vertices and one vertex from each subgroup. If $n_i$ is the number of vertices in the i-th subgroup, then the number of possible designs of experiments is equal to $\prod_{i=1}^{k} n_i$, where k is the number of subgroups.


Example  3.5  [9] 

To begin with, we analyze a three-component composition with these limitations on the components:

$0.1 \le C_1 \le 0.7$, $b_1 - a_1 = 0.6$; $0.0 \le C_2 \le 0.7$, $b_2 - a_2 = 0.7$; $0.1 \le C_3 \le 0.6$, $b_3 - a_3 = 0.5$.

The components, ordered by increasing rank (range $b_i - a_i$), are shown in Table 3.3.

Table 3.3  Components with limitations

Component    Minimum a_i    Maximum b_i    Rank (b_i - a_i)
X1 = C3      0.1            0.6            0.5
X2 = C1      0.1            0.7            0.6
X3 = C2      0.0            0.7            0.7

The vertices of the design core come from the $2^{q-1} = 2^{3-1} = 2^2$ full factorial design whose upper and lower levels are the limitations of the q-1 factors. The levels of the component $X_q = X_3$ are determined thus:

$$X_3 = 1.0 - \sum_{i=1}^{2} X_i = 1.0 - X_1 - X_2$$

The resulting vertices are given by the component ratios in Table 3.4.

Vertices B and C have $X_3 = 0.3$ and 0.2, respectively. These values are within the limitations 0.0 to 0.7 for component $X_3$. However, the $X_3$ levels in vertices A and D ($X_3 = 0.8$ and -0.3) are outside the limiting interval, so vertices A and D have to be adjusted to satisfy the limitations of $X_3$. The level of $X_3$ for vertex A should be reduced by 0.1 to obtain 0.7, which is the upper limit. The level of $X_3$ for vertex D has to be increased by 0.3 to obtain 0.0, which is the lower limit. Components $X_1$ and $X_2$ of both vertices then have to be corrected, one at a time, to compensate for the change in $X_3$. Results of the corrections are given in Table 3.5. According to the XVERT algorithm we obtain two additional vertices each for A and D, which gives six extreme vertices in total, as shown in Fig. 3.6.

Table 3.4  Design of experiments matrix

Vertex    X1     X2     X3
A         0.1    0.1    0.8
B         0.6    0.1    0.3
C         0.1    0.7    0.2
D         0.6    0.7    -0.3

Table 3.5  Corrected levels

Vertex    X1     X2     X3      Correction
A         0.1    0.1    0.8
A1        0.1    0.2    0.7     +0.1
A2        0.2    0.1    0.7     +0.1
D         0.6    0.7    -0.3
D1        0.6    0.4    0       -0.3
D2        0.3    0.7    0       -0.3

Table 3.6  Ratios of components

Vertex    Vertex mark    X1     X2     X3
B         1              0.6    0.1    0.3
C         2              0.1    0.7    0.2
A1        3              0.1    0.2    0.7
A2        4              0.2    0.1    0.7
D1        5              0.6    0.4    0.0
D2        6              0.3    0.7    0.0

Figure 3.6  Three-component extreme vertices screening experiment

To obtain a design of experiment with four trials (vertices), one vertex is taken from each of the two candidate subgroups generated for vertices A and D; each subgroup contains two vertices, Table 3.6. The design of experiment therefore consists of vertices 1, 2, (3 or 4) and (5 or 6). Out of the four possible vertex combinations, the statistically most efficient is the design of experiment with vertices (trials) 1, 2, 3 and 5.
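The steps of the XVERT algorithm can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the original XVERT program: the function name `xvert`, the rounding to six decimals and the data layout are choices made here for the illustration. The components are assumed to be already ordered by increasing rank, so the last pair of limits belongs to $X_q$. Run on the limitations of Table 3.3, it reproduces the core vertices B and C and the two candidate subgroups of Table 3.5.

```python
from itertools import product

def xvert(bounds, ndigits=6):
    """Return (core, subgroups) for components ordered by increasing rank.

    bounds: list of (lower, upper) pairs; the last pair belongs to X_q.
    This is a hypothetical helper written for illustration only.
    """
    *head, (aq, bq) = bounds
    core, subgroups = [], []
    # Step b: two-level factorial on the first q-1 components.
    for levels in product(*head):
        xq = round(1.0 - sum(levels), ndigits)       # step c
        if aq <= xq <= bq:                           # step d: extreme vertex
            core.append(levels + (xq,))
            continue
        limit = bq if xq > bq else aq                # violated limit of X_q
        shift = xq - limit                           # amount to redistribute
        group = []
        for i, (a, b) in enumerate(head):            # step e
            xi = round(levels[i] + shift, ndigits)
            if a <= xi <= b:                         # keep feasible moves only
                vertex = list(levels)
                vertex[i] = xi
                group.append(tuple(vertex) + (limit,))
        subgroups.append(group)
    return core, subgroups

bounds = [(0.1, 0.6), (0.1, 0.7), (0.0, 0.7)]        # X1, X2, X3 of Table 3.3
core, subgroups = xvert(bounds)

# Number of possible designs: product of the subgroup sizes.
n_designs = 1
for group in subgroups:
    n_designs *= len(group)

print(core)
print(subgroups)
print(n_designs)
```

With two subgroups of two vertices each, the product of subgroup sizes gives the four candidate designs mentioned in the example. Choosing the best of them requires an efficiency criterion, which this sketch does not implement.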

Example  3.6  [10] 

A four-component example is analyzed here because it has a small number of extreme vertices, which makes it possible to present all the possible designs of experiments. The limitations for the four components are as follows:

$0.00 \le X_1 \le 0.04$, $b_1 - a_1 = 0.04$; $0.00 \le X_2 \le 0.10$, $b_2 - a_2 = 0.10$;

$0.40 \le X_3 \le 0.55$, $b_3 - a_3 = 0.15$; $0.40 \le X_4 \le 0.60$, $b_4 - a_4 = 0.20$.

The ratios of components for the 10 extreme vertices and the designs of experiments for n=4 and n=8 are given in Tables 3.7 and 3.8. The levels (ratios) of components $X_1$, $X_2$ and $X_3$ in the design of experiment with n=4 are generated from the fractional design $2^{3-1}$ with generating ratio $X_3 = X_1 X_2$, Table 3.9.

The calculated levels of $X_4$ are within limitations for the first three design points. Design point D may be adjusted in two ways (by reducing $X_3$ by 0.09 and increasing $X_4$ by 0.09, or by reducing $X_2$ by 0.09 and increasing $X_4$ by 0.09), so that two designs of experiments are possible: the design with vertices 2, 3, 5 and 8 or the design with vertices 2, 3, 5 and 10. The levels of components $X_1$, $X_2$ and $X_3$ in the design of experiments with 8 trials are generated from the full factorial experiment (FUFE) $2^3$. In this case two design points have $X_4$ outside limitations. Each of them may be adjusted in two ways, so that 2 x 2 = 4 designs of experiments may be formed.

Table 3.8 shows that the designs of experiments generated by the XVERT software are statistically more efficient than the CADEX designs.

Table 3.7  Extreme vertices

Vertex    X1      X2      X3      X4
1         0.00    0.00    0.40    0.60
2         0.00    0.10    0.40    0.50
3         0.04    0.00    0.40    0.56
4         0.04    0.10    0.40    0.46
5         0.00    0.00    0.55    0.45
6         0.04    0.00    0.55    0.41
7         0.00    0.10    0.50    0.40
8         0.04    0.10    0.46    0.40
9         0.00    0.05    0.55    0.40
10        0.04    0.01    0.55    0.40

Table 3.8  Designs of experiment

n    Algorithm           Vertices                  G-efficiency
4    All combinations    1, 4, 6, 7                81
     XVERT               2, 3, 5, 8                62
     CADEX               1, 2, 8, 9                28
8    All combinations    1, 2, 3, 4, 5, 6, 7, 8    88
     XVERT               1, 2, 3, 4, 5, 6, 7, 8    88
     CADEX               1, 2, 3, 5, 6, 7, 8, 9    72

CADEX: software for generating designs [10]


Example  3.7 

To select the components of a petrol composition, define the matrix of the extreme vertices screening design. The petrol is made of five components with these limitations:

$0.00 \le X_1 \le 0.10$, $b_1 - a_1 = 0.10$; $0.00 \le X_2 \le 0.10$, $b_2 - a_2 = 0.10$; $0.05 \le X_3 \le 0.15$, $b_3 - a_3 = 0.10$;

$0.20 \le X_4 \le 0.40$, $b_4 - a_4 = 0.20$; $0.40 \le X_5 \le 0.60$, $b_5 - a_5 = 0.20$.

Table 3.9  Design of experiment n=4

Trial    Vertex    X1      X2      X3      X4
A        5         0.0     0.0     0.55    0.45
B        3         0.04    0.0     0.40    0.56
C        2         0.0     0.10    0.40    0.50
D                  0.04    0.10    0.55    -
D1       8         0.04    0.10    0.46    0.40
D2       10        0.04    0.01    0.55    0.40

Earlier research proved that the linear regression model adequately describes the octane number as a function of the component ratios. The design core and candidate subgroups obtained by the XVERT algorithm are shown in Table 3.10.

It is interesting to note that the XVERT algorithm includes in the design of experiment the first point of each candidate subgroup (the only candidate points with a measured response in Table 3.10). The linear regression model has this form:

$$\hat{y} = 102.4X_1 + 100.7X_2 + 85.2X_3 + 84.7X_4 + 97.6X_5$$

To use the designs generated by the XVERT algorithm in practice, consult Table 3.11.
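As a quick numerical illustration, the linear model of this example can be evaluated at a design point and compared with the measured response. The sketch below is an assumption-laden convenience, not part of the original study: the function name `predict_octane` is hypothetical, while the coefficients and the composition of vertex 1 are taken from the example and Table 3.10.

```python
# Coefficients of the linear model quoted in Example 3.7.
BETA = (102.4, 100.7, 85.2, 84.7, 97.6)

def predict_octane(x):
    """Linear mixture model y = sum(beta_i * X_i); ratios must sum to 1."""
    if abs(sum(x) - 1.0) > 1e-9:
        raise ValueError("component ratios must sum to 1")
    return sum(b * xi for b, xi in zip(BETA, x))

# Vertex 1 of Table 3.10 (observed response 95.1).
vertex_1 = (0.10, 0.10, 0.05, 0.20, 0.55)
y_hat = predict_octane(vertex_1)
print(round(y_hat, 2))
```

The prediction for vertex 1 is close to the tabulated response of 95.1, which is consistent with the statement that the linear model describes the octane number adequately.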


Table 3.10  Design of experiment q=5

Vertices group          Vertex    X1      X2      X3      X4      X5      y
Design core             1         0.10    0.10    0.05    0.20    0.55    95.1
                        2         0.10    0.00    0.15    0.20    0.55    93.4
                        3         0.00    0.10    0.15    0.20    0.55    93.3
                        4         0.10    0.10    0.15    0.20    0.45    94.1
                        5         0.00    0.00    0.05    0.40    0.55    91.8
                        6         0.10    0.00    0.05    0.40    0.45    91.8
                        7         0.00    0.10    0.05    0.40    0.45    92.5
                        8         0.00    0.00    0.15    0.40    0.45    90.5
                        9         0.00    0.00    0.05    0.35    0.60    92.7
                        10        0.10    0.10    0.15    0.25    0.40    93.5
Candidate subgroup 1    11        0.10    0.00    0.05    0.25    0.60    94.8
                        12        0.10    0.00    0.10    0.20    0.60    -
                        13        0.10    0.05    0.05    0.20    0.60    -
Candidate subgroup 2    14        0.00    0.10    0.05    0.25    0.60    93.7
                        15        0.00    0.10    0.10    0.20    0.60    -
                        16        0.05    0.10    0.05    0.20    0.60    -
Candidate subgroup 3    17        0.00    0.00    0.15    0.25    0.60    92.5
                        18        0.00    0.05    0.15    0.20    0.60    -
                        19        0.05    0.00    0.15    0.20    0.60    -
Candidate subgroup 4    20        0.10    0.10    0.05    0.35    0.40    93.1
                        21        0.10    0.05    0.05    0.40    0.40    -
                        22        0.05    0.10    0.05    0.40    0.40    -
Candidate subgroup 5    23        0.10    0.00    0.15    0.35    0.40    91.8
                        24        0.10    0.00    0.10    0.40    0.40    -
                        25        0.05    0.00    0.15    0.40    0.40    -
Candidate subgroup 6    26        0.00    0.10    0.15    0.35    0.40    91.6
                        27        0.00    0.10    0.10    0.40    0.40    -
                        28        0.00    0.05    0.15    0.40    0.40    -

Table 3.11  Design of experiments by XVERT algorithm

Mark    q    N      n     G-efficiency
A       4    10     4     62
                    8     84
B       5    20     8     84
                    11    77
                    16    88
C       5    28     8     83
                    12    71
                    12    63
                    16    95
D       6    49     8     78
                    12    79
                    16    82
                    16    79
                    20    77
E       6    60     7     84
                    7     84
                    11    79
                    19    78
                    19    78
F       6    36     9     59
                    9     60
                    11    51
                    11    52
                    14    77
                    14    64
G       7    109    8     76
                    12    70
                    12    77
                    16    69
                    16    71

Example  3.8 

In developing a new eight-component product, economic criteria demand that the composition include these four components: $X_1$, $X_2$, $X_5$ and $X_6$. The effects of the other four components are unknown. The limitations of the components are these:

$0.10 \le X_1 \le 0.45$; $0.05 \le X_2 \le 0.50$; $0.00 \le X_3 \le 0.10$; $0 \le X_4 \le 0.10$;

$0.10 \le X_5 \le 0.60$; $0.05 \le X_6 \le 0.20$; $0.00 \le X_7 \le 0.05$; $0 \le X_8 \le 0.05$.

The design of the experiment with outcomes is given in Table 3.12.

Table 3.12  Extreme vertices screening design

Trial    X1       X2       X3      X4      X5       X6       X7       X8       y
1        0.100    0.50     0.00    0.00    0.10     0.20     0.05     0.05     30
2        0.100    0.05     0.00    0.00    0.55     0.20     0.05     0.05     113
3        0.100    0.50     0.00    0.10    0.10     0.20     0.00     0.00     17
4        0.150    0.05     0.00    0.10    0.60     0.05     0.05     0.00     94
5        0.100    0.05     0.10    0.00    0.55     0.20     0.00     0.00     89
6        0.100    0.50     0.10    0.10    0.10     0.05     0.00     0.05     18
7        0.100    0.05     0.10    0.10    0.55     0.05     0.00     0.05     90
8        0.400    0.05     0.10    0.10    0.10     0.20     0.05     0.00     20
9        0.350    0.05     0.10    0.10    0.10     0.20     0.05     0.05     21
10       0.300    0.50     0.00    0.00    0.10     0.05     0.00     0.05     15
11       0.100    0.50     0.10    0.00    0.20     0.05     0.05     0.00     28
12       0.450    0.05     0.00    0.00    0.45     0.05     0.00     0.00     48
13       0.450    0.20     0.00    0.10    0.10     0.05     0.05     0.05     18
14       0.450    0.15     0.00    0.10    0.10     0.20     0.00     0.00     7
15       0.450    0.25     0.10    0.00    0.10     0.05     0.05     0.00     16
16       0.450    0.10     0.10    0.00    0.10     0.20     0.00     0.05     19
17*      0.259    0.222    0.05    0.05    0.244    0.125    0.025    0.025    38
18*      0.259    0.222    0.05    0.05    0.244    0.125    0.025    0.025    30
19*      0.259    0.222    0.05    0.05    0.244    0.125    0.025    0.025    35
20*      0.259    0.222    0.05    0.05    0.244    0.125    0.025    0.025    40

* The conditions chosen for central points 17-20 are the average levels of the previous 16 design points.

The regression coefficients of the linear model have these values:

$\beta_1 = -33.3$; $\beta_2 = -10.3$; $\beta_3 = -2.7$; $\beta_4 = -19.7$; $\beta_5 = 150.4$; $\beta_6 = -46.6$; $\beta_7 = 165.5$; $\beta_8 = 188.6$.

It is evident that the effects of components $X_1$, $X_2$, $X_3$, $X_4$ and $X_6$ are negative, and those of components $X_5$, $X_7$ and $X_8$ positive. By their size, the effects of components have this sequence: $X_5$, $X_1$, $X_2$, $X_4$, $X_8$. Components $X_3$, $X_6$ and $X_7$ have small effects. It has been said that, for reasons of economy, the composition of the product should include components $X_1$, $X_2$, $X_5$ and $X_6$. Since the research objective is to increase the response y, it is necessary to exclude components $X_3$ and $X_4$ from the composition and to keep components $X_1$ and $X_2$ at their lower levels. An analysis of the regression coefficients shows that one can establish these approximate relations:

$$\beta_1 \approx \beta_4; \quad \beta_2 \approx \beta_3; \quad \beta_5 \approx \beta_7 \approx \beta_8$$

Such approximate relations suggest modeling the system as a three-component mixture:

$$Z_1 = \frac{X_1 + X_4}{1 - X_6}; \quad Z_2 = \frac{X_2 + X_3}{1 - X_6}; \quad Z_3 = \frac{X_5 + X_7 + X_8}{1 - X_6}.$$

3.2

Simplex Lattice Design

When studying the properties of a q-component mixture that depend on the component ratios only, the factor space is a regular (q-1)-simplex, and for the mixture the relationship holds

$$\sum_{i=1}^{q} X_i = 1; \quad 0 \le X_i \le 1 \qquad (3.10a)$$

where
$X_i$ is the concentration (ratio) of the i-th component, and
q is the number of components in the mixture.

For binary systems (q=2) the simplex of dimension 1 is a straight line segment. For q=3, the regular 2-simplex is an equilateral triangle with its interior. Each point in the triangle corresponds to a certain composition of the ternary system, and conversely each composition is represented by one distinct point. The composition may be expressed as a molar, weight, or volume fraction or a percentage. Vertices of the triangle represent pure substances, and the sides represent binary mixtures. If we draw an altitude from each vertex, dissect each altitude into ten equal segments and draw through the division points straight lines parallel to the triangle sides, we obtain a triangular network, the simplex lattice. Moving from a side toward the opposite vertex corresponds to a proportional increase in the content of the "vertex" component; therefore each step from one line parallel to a side to the next signifies growth of the third component by 10 per cent, Fig. 3.7. In actual practice, though, no altitudes are drawn; instead the component contents are marked off directly on the triangle sides. This method of counting is adopted in Gibbs's triangle. In Roozeboom's triangle, the composition of a ternary system is read from three segments of one side, Fig. 3.8.


Figure 3.7  Gibbs's concentration triangle

Figure 3.8  Roozeboom's concentration triangle


In the concentration triangle, points lying on a straight line originating from a vertex correspond to mixtures with a constant ratio of the components represented by the other two vertices. The property (y) is normally depicted as projections of lines of constant value onto the plane of the concentration triangle.
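The constant-ratio property of lines through a vertex can be checked numerically. The following sketch maps a ternary composition to Cartesian coordinates of an equilateral triangle; the placement of the vertices and the helper names `ternary_to_xy` and `cross` are assumptions made for this illustration only.

```python
from math import sqrt

def ternary_to_xy(x1, x2, x3):
    """Map a ternary composition to Cartesian coordinates.

    Assumed layout: pure component 1 at (0, 0), component 2 at (1, 0),
    component 3 at (0.5, sqrt(3)/2).
    """
    total = x1 + x2 + x3
    x2, x3 = x2 / total, x3 / total      # normalize so the ratios sum to 1
    return x2 + 0.5 * x3, sqrt(3) / 2 * x3

def cross(o, a, b):
    """z-component of (a - o) x (b - o); zero when o, a, b are collinear."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

apex = ternary_to_xy(0, 0, 1)            # pure component 3
p = ternary_to_xy(0.6, 0.2, 0.2)         # X1:X2 = 3:1
q = ternary_to_xy(0.3, 0.1, 0.6)         # X1:X2 = 3:1, more of component 3
collinear = abs(cross(apex, p, q)) < 1e-12
print(collinear)
```

Both mixtures keep the ratio of components 1 and 2 at 3:1, so they fall on one straight line through the vertex of component 3.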

At q=4, the regular simplex is a tetrahedron in which each vertex represents a pure component, an edge represents a binary system, and a face a ternary one. Points inside the tetrahedron correspond to quaternary systems.

It is known that constructing multicomponent mixture diagrams requires, as a rule, a large number of experiments. Scheffe [5] suggested in 1958 that to solve such problems we should use some properties of the geometric figures that are generally used, as demonstrated, to illustrate mixture compositions. Kurnakov [11] has also demonstrated that the composition of a q-component mixture may be given by a (q-1)-dimensional simplex. It is also known [11] that to each phase or complex of phases in equilibrium in such a system there corresponds a definite geometric image or function (principle of correspondence), and that this function is continuous (principle of continuity). It is also clear that such a continuous function may be approximated by expansion into a Taylor series. Hence, a change in a property of a mixture may be expressed by a polynomial of definite degree in the independent variables $X_1, X_2, \dots, X_q$, where $X_i$ is the ratio of the i-th component in the mixture. A polynomial of n-th degree in q variables has $\binom{q+n}{n}$ coefficients:

$$y = b_0 + \sum_{1\le i\le q} b_i X_i + \sum_{1\le i\le j\le q} b_{ij} X_i X_j + \sum_{1\le i\le j\le k\le q} b_{ijk} X_i X_j X_k + \dots + \sum b_{i_1 i_2 \dots i_n} X_{i_1} X_{i_2} \cdots X_{i_n} \qquad (3.11)$$

Scheffe [5] suggested describing mixture properties by reduced polynomials obtainable from Eq. (3.11), which is subject to the normalization condition of Eq. (3.2) for the sum of independent variables. We shall demonstrate below how such a reduced second-degree polynomial is derived for a ternary system. The polynomial has the general form:

$$y = b_0 + b_1X_1 + b_2X_2 + b_3X_3 + b_{12}X_1X_2 + b_{13}X_1X_3 + b_{23}X_2X_3 + b_{11}X_1^2 + b_{22}X_2^2 + b_{33}X_3^2 \qquad (3.12)$$

Since:

$$X_1 + X_2 + X_3 = 1 \qquad (3.13)$$

we have:

$$b_0X_1 + b_0X_2 + b_0X_3 = b_0 \qquad (3.14)$$

Multiplying Eq. (3.13) by $X_1$, $X_2$ and $X_3$ in succession gives:

$$\begin{aligned}
X_1^2 &= X_1 - X_1X_2 - X_1X_3 \\
X_2^2 &= X_2 - X_1X_2 - X_2X_3 \\
X_3^2 &= X_3 - X_1X_3 - X_2X_3
\end{aligned} \qquad (3.15)$$

Substituting Eqs. (3.14) and (3.15) into (3.12), we obtain after the necessary transformations:

$$\begin{aligned}
y = {} & (b_0 + b_1 + b_{11})X_1 + (b_0 + b_2 + b_{22})X_2 + (b_0 + b_3 + b_{33})X_3 \\
& + (b_{12} - b_{11} - b_{22})X_1X_2 + (b_{13} - b_{11} - b_{33})X_1X_3 + (b_{23} - b_{22} - b_{33})X_2X_3
\end{aligned} \qquad (3.16)$$

We denote

$$\beta_i = b_0 + b_i + b_{ii}; \quad \beta_{ij} = b_{ij} - b_{ii} - b_{jj} \qquad (3.17)$$

Then we arrive at the reduced second-degree polynomial in three variables:

$$y = \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_{12}X_1X_2 + \beta_{13}X_1X_3 + \beta_{23}X_2X_3 \qquad (3.18)$$
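The equivalence of the full polynomial (3.12) and the reduced polynomial (3.18) on the simplex can be checked numerically. The sketch below uses arbitrary random coefficients and a random composition; all variable names are illustrative assumptions, and the only relations taken from the text are Eqs. (3.12), (3.17) and (3.18).

```python
import random

random.seed(0)
# Arbitrary coefficients of the full second-degree polynomial (3.12).
b0 = random.uniform(-5, 5)
b = {i: random.uniform(-5, 5) for i in (1, 2, 3)}
bb = {(i, j): random.uniform(-5, 5)
      for i in (1, 2, 3) for j in (1, 2, 3) if i <= j}

def full(x):
    """Eq. (3.12): ten coefficients."""
    y = b0 + sum(b[i] * x[i] for i in b)
    return y + sum(c * x[i] * x[j] for (i, j), c in bb.items())

# Reduced coefficients according to Eq. (3.17).
beta = {i: b0 + b[i] + bb[i, i] for i in b}
beta2 = {(i, j): bb[i, j] - bb[i, i] - bb[j, j]
         for (i, j) in bb if i < j}

def reduced(x):
    """Eq. (3.18): six coefficients."""
    y = sum(beta[i] * x[i] for i in beta)
    return y + sum(c * x[i] * x[j] for (i, j), c in beta2.items())

# Random composition on the simplex X1 + X2 + X3 = 1.
u, v = sorted(random.random() for _ in range(2))
x = {1: u, 2: v - u, 3: 1 - v}
max_gap = abs(full(x) - reduced(x))
print(max_gap < 1e-9)
```

The two polynomials agree only for compositions that satisfy Eq. (3.13); off the simplex they are different functions.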

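Thus the number of coefficients has decreased from ten to six. In the general case of a q-component system, regression models of different degrees have these forms:

a)  Linear regression model:

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i \qquad (3.19)$$

b)  Second-degree (quadratic) regression model:

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j \qquad (3.20)$$

c)  Incomplete cubic regression model:

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j<k\le q} \beta_{ijk} X_i X_j X_k \qquad (3.21)$$

d)  Complete cubic regression model:

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j\le q} \gamma_{ij} X_i X_j (X_i - X_j) + \sum_{1\le i<j<k\le q} \beta_{ijk} X_i X_j X_k \qquad (3.22)$$

e)  Fourth-degree regression model:

$$\begin{aligned}
\hat{y} = {} & \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j\le q} \gamma_{ij} X_i X_j (X_i - X_j) + \sum_{1\le i<j\le q} \delta_{ij} X_i X_j (X_i - X_j)^2 \\
& + \sum_{1\le i<j<k\le q} \beta_{iijk} X_i^2 X_j X_k + \sum_{1\le i<j<k\le q} \beta_{ijjk} X_i X_j^2 X_k + \sum_{1\le i<j<k\le q} \beta_{ijkk} X_i X_j X_k^2 \\
& + \sum_{1\le i<j<k<l\le q} \beta_{ijkl} X_i X_j X_k X_l
\end{aligned} \qquad (3.23)$$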


3.3 

Scheffe  Simplex  Lattice  Design 

The most frequently used mixture "composition-property" designs of experiments belong to the simplex-lattice designs suggested by Scheffe [5]. The basis of this kind of design is a uniform scatter of experimental points on the so-called simplex lattice. The design points form a {q,n} lattice in a (q-1)-simplex, where q is the number of components in a composition and n is the degree of the polynomial. For each component there exist (n+1) equally spaced levels $X_i = 0, 1/n, 2/n, \dots, 1$, and all possible combinations of these component concentrations are used. So, for instance, for the quadratic lattice {q,2}, which approximates the response surface with second-degree polynomials (n=2), the following levels of every factor must be used: 0, 1/2 and 1; for the cubic lattice (n=3): 0, 1/3, 2/3 and 1, etc. The number of design points needed for a given polynomial degree and number of components is shown in Table 3.13. Some of the {3,n} and {4,n} lattices are depicted in Figs. 3.9 and 3.10.

Table 3.13  Number of design points of simplex lattice designs

Number of       Polynomial degree
components      2      Incomplete 3    3      4
3               6      7               10     15
4               10     14              20     35
5               15     25              35     70
6               21     41              56     126
8               36     92              120    330
10              55     175             220    715

Figure 3.9  {3,n}-lattices: a) for a second-degree polynomial, b) for an incomplete-third-degree polynomial, c) for a third-degree polynomial, d) for a fourth-degree polynomial


When using simplex lattice designs, we apply the principle of upgrading (augmenting) the design, which was analyzed in previous chapters. The polynomial degree is increased in the standard way after checking the lack of fit of the obtained regression model. The rule is known: the higher the polynomial degree, the greater the number of design points in the design of experiments matrix. The required number of design points is calculated from this expression:

$$N = \frac{(n+q-1)!}{n!\,(q-1)!} \qquad (3.24)$$

where:

n is the polynomial degree,
q is the number of components.

Results of such calculations are given in Table 3.13. To obtain a linear regression model for a three-component mixture, one has to run three design points in the simplex vertices. To check the lack of fit of the model, one has to run an additional design point in the simplex center. The design of the experiment is given in Table 3.14.

Figure 3.10  {4,n}-lattices: a) for a second-degree polynomial, b) for an incomplete-third-degree polynomial, c) for a third-degree polynomial, d) for a fourth-degree polynomial
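The construction of a {q,n} lattice and the point count of Eq. (3.24) can be sketched as follows. The function names `simplex_lattice` and `n_points` are illustrative assumptions; the brute-force enumeration is meant only for small q and n.

```python
from fractions import Fraction
from itertools import product
from math import comb

def simplex_lattice(q, n):
    """All points of the {q, n} lattice: levels 0, 1/n, ..., 1 summing to 1."""
    return [tuple(Fraction(c, n) for c in counts)
            for counts in product(range(n + 1), repeat=q)
            if sum(counts) == n]

def n_points(q, n):
    """Eq. (3.24): (n + q - 1)! / (n! (q - 1)!)."""
    return comb(n + q - 1, n)

lattice_32 = simplex_lattice(3, 2)
for point in lattice_32:
    print([str(level) for level in point])
print(len(lattice_32), n_points(3, 2))
```

For the {3,2} lattice this enumerates the three vertices and the three edge midpoints. The counts from `n_points` agree with the columns for degrees 2, 3 and 4 in Table 3.13; the incomplete third-degree column follows a different count, because that design adds only face centroids to the {q,2} lattice.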


Table 3.14  Simplex lattice design {3,1}

N     X1     X2     X3     y
1     1      0      0      y1
2     0      1      0      y2
3     0      0      1      y3
4*    1/3    1/3    1/3    y123

* Control point

As one may conclude from the above table, the subscripts of the mixture property (response) symbols indicate the relative proportion of each of the components in the mixture. For example, mixture 1 (Table 3.14) contains the component $X_1$ alone, and the property (response) of this mixture is denoted by $y_1$; mixture 4 includes $(1/3)X_1$, $(1/3)X_2$ and $(1/3)X_3$, the property being designated as $y_{123}$. The design matrices for the simplex lattices {3,2}, {3,3} and {3,4} are shown in the tables below.


Table 3.15  Simplex lattice design {3,2}

No. of trials N    X1     X2     X3     y
1                  1      0      0      y1
2                  0      1      0      y2
3                  0      0      1      y3
4                  0.5    0.5    0      y12
5                  0.5    0      0.5    y13
6                  0      0.5    0.5    y23
7*                 1/3    1/3    1/3    y123

Table 3.16  Simplex lattice design {3,3}

No. of trials N    X1     X2     X3     y
1                  1      0      0      y1
2                  0      1      0      y2
3                  0      0      1      y3
4                  2/3    1/3    0      y112
5                  1/3    2/3    0      y122
6                  0      2/3    1/3    y223
7                  0      1/3    2/3    y233
8                  2/3    0      1/3    y113
9                  1/3    0      2/3    y133
10                 1/3    1/3    1/3    y123

The number of trials of simplex lattice designs, which depends on the number of components and the degree of the regression model, is given in Table 3.18.

Table 3.17  Simplex lattice design {3,4}

No. of trials    X1     X2     X3     y
1                1      0      0      y1
2                0      1      0      y2
3                0      0      1      y3
4                1/2    1/2    0      y12
5                1/2    0      1/2    y13
6                0      1/2    1/2    y23
7                3/4    1/4    0      y1112
8                1/4    3/4    0      y1222
9                3/4    0      1/4    y1113
10               1/4    0      3/4    y1333
11               0      3/4    1/4    y2223
12               0      1/4    3/4    y2333
13               1/2    1/4    1/4    y1123
14               1/4    1/2    1/4    y1223
15               1/4    1/4    1/2    y1233
16*              1/3    1/3    1/3    y123

Table 3.18  Number of design points of simplex lattice designs

Regression model           Response      Number of components in mixture
                           subscripts    3     4     5     6     7
Second degree              i             3     4     5     6     7
                           ij            3     6     10    15    21
Incomplete third degree    i             3     4     5     6     7
                           ij            3     6     10    15    21
                           ijk           1     4     10    20    35
Third degree               i             3     4     5     6     7
                           iij           6     12    20    30    42
                           ijk           1     4     10    20    35
Fourth degree              i             3     4     5     6     7
                           ij            3     6     10    15    21
                           iiij          6     12    20    30    42
                           ijkl          -     1     5     15    35
                           iijk          3     12    30    60    105

The coefficients of the regression equations are determined from the experimental outcomes by very simple relations. For example, the second-degree regression model (3.18) for the case $X_1 = 1$, $X_2 = 0$ and $X_3 = 0$ reduces to $y_1 = \beta_1$. We get $y_2 = \beta_2$ and $y_3 = \beta_3$ in the same way. For $X_1 = 1/2$, $X_2 = 1/2$ and $X_3 = 0$ we get:

$$y_{12} = \tfrac{1}{2}\beta_1 + \tfrac{1}{2}\beta_2 + \tfrac{1}{4}\beta_{12} \;\Rightarrow\; \beta_{12} = 4y_{12} - 2y_1 - 2y_2; \quad \beta_{13} = 4y_{13} - 2y_1 - 2y_3; \quad \beta_{23} = 4y_{23} - 2y_2 - 2y_3$$

In the general case of a q-component system, the second-order regression coefficients (3.20) are calculated by these relations:

$$\beta_i = y_i; \quad \beta_{ij} = 4y_{ij} - 2y_i - 2y_j$$

Regression coefficients in the other regression models are determined in a similar way:

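a)  Linear regression model

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i$$

$$\beta_i = y_i \qquad (3.25)$$

b)  Second-degree regression model

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j$$

$$\beta_i = y_i; \quad \beta_{ij} = 4y_{ij} - 2y_i - 2y_j \qquad (3.26)$$

c)  Incomplete third-degree regression model

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j<k\le q} \beta_{ijk} X_i X_j X_k$$

$$\begin{aligned}
\beta_i &= y_i \\
\beta_{ij} &= 4y_{ij} - 2y_i - 2y_j \\
\beta_{ijk} &= 27y_{ijk} - 12(y_{ij} + y_{ik} + y_{jk}) + 3(y_i + y_j + y_k)
\end{aligned} \qquad (3.27)$$

d)  Complete third-degree regression model

$$\hat{y} = \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j\le q} \gamma_{ij} X_i X_j (X_i - X_j) + \sum_{1\le i<j<k\le q} \beta_{ijk} X_i X_j X_k$$

$$\begin{aligned}
\beta_i &= y_i \\
\beta_{ij} &= \tfrac{9}{4}(y_{iij} + y_{ijj} - y_i - y_j) \\
\gamma_{ij} &= \tfrac{9}{4}(3y_{iij} - 3y_{ijj} - y_i + y_j) \\
\beta_{ijk} &= 27y_{ijk} - \tfrac{27}{4}(y_{iij} + y_{ijj} + y_{iik} + y_{ikk} + y_{jjk} + y_{jkk}) + \tfrac{9}{2}(y_i + y_j + y_k)
\end{aligned} \qquad (3.28)$$

e)  Fourth-degree regression model

$$\begin{aligned}
\hat{y} = {} & \sum_{1\le i\le q} \beta_i X_i + \sum_{1\le i<j\le q} \beta_{ij} X_i X_j + \sum_{1\le i<j\le q} \gamma_{ij} X_i X_j (X_i - X_j) + \sum_{1\le i<j\le q} \delta_{ij} X_i X_j (X_i - X_j)^2 \\
& + \sum_{1\le i<j<k\le q} \beta_{iijk} X_i^2 X_j X_k + \sum_{1\le i<j<k\le q} \beta_{ijjk} X_i X_j^2 X_k + \sum_{1\le i<j<k\le q} \beta_{ijkk} X_i X_j X_k^2 \\
& + \sum_{1\le i<j<k<l\le q} \beta_{ijkl} X_i X_j X_k X_l
\end{aligned} \qquad (3.29)$$

$$\begin{aligned}
\beta_i = {} & y_i \\
\beta_{ij} = {} & 4y_{ij} - 2y_i - 2y_j \\
\gamma_{ij} = {} & \tfrac{8}{3}(-y_i + 2y_{iiij} - 2y_{ijjj} + y_j) \\
\delta_{ij} = {} & \tfrac{8}{3}(-y_i + 4y_{iiij} - 6y_{ij} + 4y_{ijjj} - y_j) \\
\beta_{iijk} = {} & 32(3y_{iijk} - y_{ijjk} - y_{ijkk}) + \tfrac{8}{3}(6y_i - y_j - y_k) - 16(y_{ij} + y_{ik}) \\
& - \tfrac{16}{3}(5y_{iiij} + 5y_{iiik} - 3y_{ijjj} - 3y_{ikkk} - y_{jjjk} - y_{jkkk}) \\
\beta_{ijjk} = {} & 32(3y_{ijjk} - y_{iijk} - y_{ijkk}) + \tfrac{8}{3}(6y_j - y_i - y_k) - 16(y_{ij} + y_{jk}) \\
& - \tfrac{16}{3}(5y_{ijjj} + 5y_{jjjk} - 3y_{iiij} - 3y_{jkkk} - y_{iiik} - y_{ikkk}) \\
\beta_{ijkk} = {} & 32(3y_{ijkk} - y_{iijk} - y_{ijjk}) + \tfrac{8}{3}(6y_k - y_i - y_j) - 16(y_{ik} + y_{jk}) \\
& - \tfrac{16}{3}(5y_{ikkk} + 5y_{jkkk} - 3y_{iiik} - 3y_{jjjk} - y_{iiij} - y_{ijjj}) \\
\beta_{ijkl} = {} & 256y_{ijkl} - 32(y_{iijk} + y_{iijl} + y_{iikl} + y_{ijjk} + y_{ijjl} + y_{jjkl} + y_{ijkk} + y_{ikkl} + y_{jkkl} + y_{ijll} + y_{ikll} + y_{jkll}) \\
& + \tfrac{32}{3}(y_{iiij} + y_{iiik} + y_{iiil} + y_{ijjj} + y_{jjjk} + y_{jjjl} + y_{ikkk} + y_{jkkk} + y_{kkkl} + y_{illl} + y_{jlll} + y_{klll})
\end{aligned} \qquad (3.30)$$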
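As an illustration of how simple these relations are in use, the second-degree coefficients can be computed directly from measured responses. The sketch below takes the mean melting points of trials 1 to 6 in Table 3.19 (Example 3.9); the dictionary layout and the function name `predict` are assumptions made for this illustration.

```python
from itertools import combinations

# Mean melting points [deg C] of trials 1-6 in Table 3.19 (Example 3.9).
y = {(1,): 72.75, (2,): 54.25, (3,): 57.25,
     (1, 2): 65.75, (1, 3): 68.5, (2, 3): 52.5}

beta = {i: y[(i,)] for i in (1, 2, 3)}                    # beta_i = y_i
beta.update({(i, j): 4 * y[i, j] - 2 * y[(i,)] - 2 * y[(j,)]
             for i, j in combinations((1, 2, 3), 2)})     # beta_ij

def predict(x1, x2, x3):
    """Second-degree model (3.20) for a three-component mixture."""
    x = {1: x1, 2: x2, 3: x3}
    linear = sum(beta[i] * x[i] for i in (1, 2, 3))
    pairs = sum(beta[i, j] * x[i] * x[j]
                for i, j in combinations((1, 2, 3), 2))
    return linear + pairs

print(beta[1, 2], beta[1, 3], beta[2, 3])
print(predict(0.75, 0.25, 0.0))
```

The last line evaluates the model at the composition of trial 7, whose tabulated mean melting point is 70.25. Points of the higher-order lattice can serve as control points for the lower-order model in exactly this way.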

After obtaining a suitable regression model, a check of lack of fit follows according to the mathematical theory of experiments. Since simplex-lattice designs are saturated, no degrees of freedom are left to check lack of fit. To overcome this problem, we add additional design points, so-called control points, to the simplex lattice design, and check the lack of fit of the model there. The number of control points and the domain of the factor space where they are placed depend on the experimental situation, complexity of trials, price, etc. The recommendation is to place control points in:

• the domain of the studied mixture diagram that is interesting for the researcher;

• the existing points;

• the points that are used for obtaining a higher-order model.

The confidence of the response estimate given by a regression model differs at different points of the simplex. Let the variance of the response estimate be $S_{\hat{y}}^2$, and let the variance of replicated design points be $S_y^2$. The determination of the variance of the response estimate is demonstrated on a second-order regression model for a three-component mixture.

In our reasoning we assume that (1) $X_i$ can be observed without errors, (2) the replication variance is the same at all the design points, and (3) the response values are the averages of $n_i$ and $n_{ij}$ replicate observations at the appropriate points of the simplex. Then the variances of $\bar{y}_i$ and $\bar{y}_{ij}$ will be:
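$$S_{\bar{y}_i}^2 = \frac{S_y^2}{n_i}; \qquad S_{\bar{y}_{ij}}^2 = \frac{S_y^2}{n_{ij}} \qquad (3.31)$$

In the reduced second-order polynomial

$$\hat{y} = \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_{12}X_1X_2 + \beta_{13}X_1X_3 + \beta_{23}X_2X_3$$

we replace the coefficients by their expressions in terms of responses:

$$\beta_i = y_i; \quad \beta_{ij} = 4y_{ij} - 2y_i - 2y_j$$

We then obtain

$$\begin{aligned}
\hat{y} = {} & y_1X_1 + y_2X_2 + y_3X_3 + (4y_{12} - 2y_1 - 2y_2)X_1X_2 + (4y_{13} - 2y_1 - 2y_3)X_1X_3 + (4y_{23} - 2y_2 - 2y_3)X_2X_3 \\
= {} & y_1(X_1 - 2X_1X_2 - 2X_1X_3) + y_2(X_2 - 2X_1X_2 - 2X_2X_3) + y_3(X_3 - 2X_1X_3 - 2X_2X_3) \\
& + 4y_{12}X_1X_2 + 4y_{13}X_1X_3 + 4y_{23}X_2X_3
\end{aligned} \qquad (3.32)$$

Using the condition $X_1 + X_2 + X_3 = 1$ we transform the coefficients at $y_i$:

$$X_1 - 2X_1X_2 - 2X_1X_3 = X_1 - 2X_1(X_2 + X_3) = X_1 - 2X_1(1 - X_1) = X_1(2X_1 - 1), \text{ etc.} \qquad (3.33)$$

so that:

$$\hat{y} = X_1(2X_1 - 1)y_1 + X_2(2X_2 - 1)y_2 + X_3(2X_3 - 1)y_3 + 4X_1X_2y_{12} + 4X_1X_3y_{13} + 4X_2X_3y_{23} \qquad (3.34)$$

Introducing the designation

$$a_i = X_i(2X_i - 1); \quad a_{ij} = 4X_iX_j \qquad (3.35)$$

and using Eqs. (3.31) and (3.34) gives the expression for the variance of the response estimate:

$$S_{\hat{y}}^2 = S_y^2\left(\sum_{1\le i\le q}\frac{a_i^2}{n_i} + \sum_{1\le i<j\le q}\frac{a_{ij}^2}{n_{ij}}\right) \qquad (3.36)$$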

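For incomplete-third-, third- and fourth-degree polynomials the relationships are derived in much the same way. So, for the incomplete-third-degree polynomial:

$$S_{\hat{y}}^2 = S_y^2\left(\sum_{1\le i\le q}\frac{b_i^2}{n_i} + \sum_{1\le i<j\le q}\frac{b_{ij}^2}{n_{ij}} + \sum_{1\le i<j<k\le q}\frac{b_{ijk}^2}{n_{ijk}}\right) \qquad (3.37)$$

where:

$$b_i = \tfrac{1}{2}X_i\Big(6X_i^2 - 2X_i + 1 - 3\sum_{j=1}^{q}X_j^2\Big) \qquad (3.38)$$

$$b_{ij} = 4X_iX_j(3X_i + 3X_j - 2) \qquad (3.39)$$

$$b_{ijk} = 27X_iX_jX_k \qquad (3.40)$$

For the third-degree polynomial:

$$S_{\hat{y}}^2 = S_y^2\left(\sum_{1\le i\le q}\frac{c_i^2}{n_i} + \sum_{1\le i<j\le q}\frac{c_{iij}^2}{n_{iij}} + \sum_{1\le i<j\le q}\frac{c_{ijj}^2}{n_{ijj}} + \sum_{1\le i<j<k\le q}\frac{c_{ijk}^2}{n_{ijk}}\right) \qquad (3.41)$$

where:

$$c_i = \tfrac{1}{2}X_i(3X_i - 1)(3X_i - 2) \qquad (3.42)$$

$$c_{iij} = \tfrac{9}{2}X_iX_j(3X_i - 1) \qquad (3.43)$$

$$c_{ijj} = \tfrac{9}{2}X_iX_j(3X_j - 1) \qquad (3.44)$$

$$c_{ijk} = 27X_iX_jX_k \qquad (3.45)$$

For the fourth-degree polynomial:

$$\begin{aligned}
S_{\hat{y}}^2 = S_y^2\Bigg( & \sum_{1\le i\le q}\frac{d_i^2}{n_i} + \sum_{1\le i<j\le q}\frac{d_{ij}^2}{n_{ij}} + \sum_{1\le i<j\le q}\frac{d_{iiij}^2}{n_{iiij}} + \sum_{1\le i<j\le q}\frac{d_{ijjj}^2}{n_{ijjj}} \\
& + \sum_{1\le i<j<k\le q}\frac{d_{iijk}^2}{n_{iijk}} + \sum_{1\le i<j<k\le q}\frac{d_{ijjk}^2}{n_{ijjk}} + \sum_{1\le i<j<k\le q}\frac{d_{ijkk}^2}{n_{ijkk}} + \sum_{1\le i<j<k<l\le q}\frac{d_{ijkl}^2}{n_{ijkl}}\Bigg)
\end{aligned} \qquad (3.46)$$

where:

$$\begin{aligned}
d_i &= \tfrac{1}{6}X_i(4X_i - 1)(4X_i - 2)(4X_i - 3) \\
d_{ij} &= 4X_iX_j(4X_i - 1)(4X_j - 1) \\
d_{iiij} &= \tfrac{8}{3}X_iX_j(4X_i - 1)(4X_i - 2) \\
d_{ijjj} &= \tfrac{8}{3}X_iX_j(4X_j - 1)(4X_j - 2) \\
d_{iijk} &= 32X_iX_jX_k(4X_i - 1) \\
d_{ijjk} &= 32X_iX_jX_k(4X_j - 1) \\
d_{ijkk} &= 32X_iX_jX_k(4X_k - 1) \\
d_{ijkl} &= 256X_iX_jX_kX_l
\end{aligned} \qquad (3.47)$$

If the number of replications at all the points of the design is equal, i.e.

$$n_i = n_{ij} = \dots = n \qquad (3.53)$$

then all the relations for $S_{\hat{y}}^2$ will take the form:

$$S_{\hat{y}}^2 = S_y^2\,\frac{\xi}{n} \qquad (3.54)$$

where for the second-degree polynomial:

$$\xi = \sum_{1\le i\le q} a_i^2 + \sum_{1\le i<j\le q} a_{ij}^2 \qquad (3.55)$$

for the incomplete-third-degree polynomial:

$$\xi = \sum_{1\le i\le q} b_i^2 + \sum_{1\le i<j\le q} b_{ij}^2 + \sum_{1\le i<j<k\le q} b_{ijk}^2 \qquad (3.56)$$

for the third-degree polynomial:

$$\xi = \sum_{1\le i\le q} c_i^2 + \sum_{1\le i<j\le q} c_{iij}^2 + \sum_{1\le i<j\le q} c_{ijj}^2 + \sum_{1\le i<j<k\le q} c_{ijk}^2 \qquad (3.57)$$

and for the fourth-degree polynomial:

$$\begin{aligned}
\xi = {} & \sum_{1\le i\le q} d_i^2 + \sum_{1\le i<j\le q} d_{ij}^2 + \sum_{1\le i<j\le q} d_{iiij}^2 + \sum_{1\le i<j\le q} d_{ijjj}^2 \\
& + \sum_{1\le i<j<k\le q} d_{iijk}^2 + \sum_{1\le i<j<k\le q} d_{ijjk}^2 + \sum_{1\le i<j<k\le q} d_{ijkk}^2 + \sum_{1\le i<j<k<l\le q} d_{ijkl}^2
\end{aligned} \qquad (3.58)$$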
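For equal replication, the composition-dependent factor of the second-degree model can be evaluated at any point of the simplex. The sketch below computes $\xi = \sum a_i^2 + \sum a_{ij}^2$ with $a_i = X_i(2X_i - 1)$ and $a_{ij} = 4X_iX_j$; the function name `xi_quadratic` and the use of exact fractions are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def xi_quadratic(x):
    """xi for the second-degree model: sum(a_i^2) + sum(a_ij^2)."""
    a_i = [xi * (2 * xi - 1) for xi in x]
    a_ij = [4 * xi * xj for xi, xj in combinations(x, 2)]
    return sum(a * a for a in a_i) + sum(a * a for a in a_ij)

third = Fraction(1, 3)
xi_centroid = xi_quadratic((third, third, third))
xi_vertex = xi_quadratic((Fraction(1), Fraction(0), Fraction(0)))
xi_midpoint = xi_quadratic((Fraction(1, 2), Fraction(1, 2), Fraction(0)))
print(xi_centroid, xi_vertex, xi_midpoint)
```

At the design points themselves (vertices and edge midpoints) the factor equals one, so the variance of the estimate equals the variance of the averaged observation. At the centroid of the ternary simplex it is smaller than one.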


As can be seen from Eqs. (3.55) to (3.58), $\xi$ depends only on the mixture composition; geometric interpretations of $\xi$ for models of different orders are to be found in the reference literature [12]. Lack of fit of the models is checked at each control point by means of Student's t-test:

$$t_R = \frac{\Delta y\,\sqrt{n}}{S_y\sqrt{1 + \xi}} \qquad (3.59)$$

where:

$\Delta y = |y_{exp} - y_{cal}|$;

n is the number of replications;

$S_y^2$ is the error mean square of the trial;

$\xi$ is the coefficient that depends on the mixture composition.

The $t_R$ statistic has the Student distribution and is compared with the tabulated value $t_{\alpha/l;\,f}$ at a level of significance $\alpha$, where l is the number of control points and f is the number of degrees of freedom of the replication variance. The null hypothesis that the equation is adequate is accepted if $t_R < t_{\alpha/l;\,f}$ for all the control points. It should also be noted that Eq. (3.59) is valid for the same number of replications at each simplex point. The confidence interval for the response value is:

$$\hat{y} - \Delta y \le y \le \hat{y} + \Delta y \qquad (3.60)$$

$$\Delta y = t_{\alpha/k;\,f}\; S_{\hat{y}}$$

where k is the number of polynomial coefficients determined. With Eq. (3.54) this becomes:

$$\Delta y = t_{\alpha/k;\,f}\;\frac{S_y}{\sqrt{n}}\;\xi^{0.5}$$

Example 3.9 [13]

The  research  objective  is  to  define  the  composition  of  a three-component  mixture: 
Xrozocerite;  X2-paraffin  and  X3-Vaseline,  so  that  the  melting  point  is  66-68  [°C]  and 
penetration  16-18.  The  mixture  is  obtained  by  melting  and  mixing  up. 

The  experiment  has  been  realized  by  a simplex  lattice  design  matrix  for  the 
fourth-degree  model.  This  model  has  been  chosen,  for  in  case  a lower  model  order 
is  adequate,  the  excessive  points  become  control  points. 

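Table 3.19 Simplex lattice design {3,4}

Melting point [°C]: y1, y2 (replicates) and mean ȳ. Penetration: y1', y2' (replicates) and mean ȳ'.

| No. | X1 | X2 | X3 | Response mark | y1 | y2 | ȳ | y1' | y2' | ȳ' |
|-----|------|------|------|-------|------|------|-------|-----|-----|-------|
| 1 | 1.0 | 0.0 | 0.0 | y1 | 72.5 | 73.0 | 72.75 | 23 | 22 | 22.5 |
| 2 | 0.0 | 1.0 | 0.0 | y2 | 54.0 | 54.5 | 54.25 | 8 | 8 | 8 |
| 3 | 0.0 | 0.0 | 1.0 | y3 | 57.5 | 57.0 | 57.25 | 156 | 156 | 156 |
| 4 | 0.5 | 0.5 | 0.0 | y12 | 65.5 | 66.0 | 65.75 | 12 | 12 | 12 |
| 5 | 0.5 | 0.0 | 0.5 | y13 | 68.5 | 68.5 | 68.5 | 292 | 305 | 298.5 |
| 6 | 0.0 | 0.5 | 0.5 | y23 | 52.5 | 52.5 | 52.5 | 67 | 68 | 67.5 |
| 7 | 0.75 | 0.25 | 0.0 | y1112 | 70.0 | 70.5 | 70.25 | 16 | 17 | 16.5 |
| 8 | 0.25 | 0.75 | 0.0 | y1222 | 56.0 | 57.0 | 56.5 | 11 | 10 | 10.5 |
| 9 | 0.75 | 0.0 | 0.25 | y1113 | 72.0 | 71.5 | 71.75 | 74 | 75 | 74.5 |
| 10 | 0.25 | 0.0 | 0.75 | y1333 | 65.0 | 65.5 | 65.25 | 360 | 360 | 360 |
| 11 | 0.0 | 0.75 | 0.25 | y2223 | 52.5 | 53.0 | 52.75 | 30 | 26 | 28 |
| 12 | 0.0 | 0.25 | 0.75 | y2333 | 52.5 | 54.0 | 53.25 | 360 | 360 | 360 |
| 13 | 0.5 | 0.25 | 0.25 | y1123 | 66.0 | 66.0 | 66.0 | 47 | 42 | 44.5 |
| 14 | 0.25 | 0.5 | 0.25 | y1223 | 60.0 | 60.0 | 60.0 | 42 | 40 | 41.0 |
| 15 | 0.25 | 0.25 | 0.5 | y1233 | 59.5 | 60.0 | 59.75 | 144 | 146 | 145 |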

The melting points of the components and of the mixture were tested by the ASTM-D-127 method. Vaseline penetration was tested by ASTM-D-937, and that of ozocerite, paraffin and the mixture by the ASTM-D-1321 method. Calculations by Eq. (3.29) gave these values of regression coefficients:

$$\beta_1 = y_1 = 72.75;\quad \beta_2 = y_2 = 54.25;\quad \beta_3 = y_3 = 57.25;$$

$$\beta_{12} = 4y_{12} - 2y_1 - 2y_2 = 9.0;\quad \beta_{13} = 4y_{13} - 2y_1 - 2y_3 = 14.0;\quad \beta_{23} = 4y_{23} - 2y_2 - 2y_3 = -13.0;$$

$$\gamma_{12} = \tfrac{8}{3}(-y_1 + 2y_{1112} - 2y_{1222} + y_2) = 24.03;$$

$$\gamma_{13} = \tfrac{8}{3}(-y_1 + 2y_{1113} - 2y_{1333} + y_3) = -6.68;\quad \gamma_{23} = \tfrac{8}{3}(-y_2 + 2y_{2223} - 2y_{2333} + y_3) = 5.34;$$

$$\delta_{12} = \tfrac{8}{3}(-y_1 + 4y_{1112} - 6y_{12} + 4y_{1222} - y_2) = -38.72;$$

$$\delta_{13} = \tfrac{8}{3}(-y_1 + 4y_{1113} - 6y_{13} + 4y_{1333} - y_3) = 18.69;$$

$$\delta_{23} = \tfrac{8}{3}(-y_2 + 4y_{2223} - 6y_{23} + 4y_{2333} - y_3) = -6.68;$$

$$\beta_{1123} = 32(3y_{1123} - y_{1223} - y_{1233}) + \tfrac{8}{3}(6y_1 - y_2 - y_3) - 16(y_{12} + y_{13}) - \tfrac{16}{3}(5y_{1112} + 5y_{1113} - 3y_{1222} - 3y_{1333} - y_{2223} - y_{2333}) = -49.58;$$

$$\beta_{1223} = 32(3y_{1223} - y_{1123} - y_{1233}) + \tfrac{8}{3}(6y_2 - y_1 - y_3) - 16(y_{12} + y_{23}) - \tfrac{16}{3}(5y_{1222} + 5y_{2223} - 3y_{1112} - 3y_{2333} - y_{1113} - y_{1333}) = 159.32;$$

$$\beta_{1233} = 32(3y_{1233} - y_{1123} - y_{1223}) + \tfrac{8}{3}(6y_3 - y_1 - y_2) - 16(y_{13} + y_{23}) - \tfrac{16}{3}(5y_{1333} + 5y_{2333} - 3y_{1113} - 3y_{2223} - y_{1112} - y_{1222}) = -145.94;$$

The fourth-order regression model for the melting point has the form:

$$\begin{aligned}
\hat y ={}& 72.75X_1 + 54.25X_2 + 57.25X_3 + 9.0X_1X_2 + 14.0X_1X_3 - 13.0X_2X_3\\
&+ 24.03X_1X_2(X_1 - X_2) - 6.68X_1X_3(X_1 - X_3) + 5.34X_2X_3(X_2 - X_3)\\
&- 38.72X_1X_2(X_1 - X_2)^2 + 18.69X_1X_3(X_1 - X_3)^2 - 6.68X_2X_3(X_2 - X_3)^2\\
&- 49.58X_1^2X_2X_3 + 159.32X_1X_2^2X_3 - 145.94X_1X_2X_3^2
\end{aligned} \tag{3.63}$$


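The fourth-order model is fitted to a saturated design (15 coefficients from 15 points), so it reproduces the experimental outcomes at the design points. A check was therefore made of whether the simpler second-order regression model is already adequate:

$$\hat y = 72.75X_1 + 54.25X_2 + 57.25X_3 + 9.0X_1X_2 + 14.0X_1X_3 - 13.0X_2X_3 \tag{3.64}$$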
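The second-order coefficients used in Eq. (3.64) follow directly from the six {3,2} lattice points of Table 3.19. The sketch below is only an illustration of that arithmetic; the function names `scheffe_quadratic_coeffs` and `predict` are hypothetical helpers, not part of the original text.

```python
from itertools import combinations


def scheffe_quadratic_coeffs(y_pure, y_binary):
    """Second-order Scheffe coefficients from a {q,2} simplex lattice.

    y_pure[i]        : mean response of pure component i
    y_binary[(i, j)] : mean response of the 50:50 blend of i and j (i < j)
    """
    beta = dict(y_pure)  # beta_i = y_i
    for i, j in combinations(sorted(y_pure), 2):
        # beta_ij = 4*y_ij - 2*y_i - 2*y_j
        beta[(i, j)] = 4.0 * y_binary[(i, j)] - 2.0 * y_pure[i] - 2.0 * y_pure[j]
    return beta


def predict(beta, x):
    """Evaluate the second-order polynomial at composition x (dict: i -> X_i)."""
    total = 0.0
    for key, b in beta.items():
        if isinstance(key, tuple):      # interaction term beta_ij * X_i * X_j
            i, j = key
            total += b * x[i] * x[j]
        else:                           # linear term beta_i * X_i
            total += b * x[key]
    return total


# Melting-point means of points 1-6 in Table 3.19
y_pure = {1: 72.75, 2: 54.25, 3: 57.25}
y_binary = {(1, 2): 65.75, (1, 3): 68.50, (2, 3): 52.50}

beta = scheffe_quadratic_coeffs(y_pure, y_binary)
y7 = predict(beta, {1: 0.75, 2: 0.25, 3: 0.0})  # control point No. 7
print(beta)
print(y7)
```

With the table values this gives interaction coefficients 9.0, 14.0 and -13.0 and a prediction of 69.8125 at control point No. 7.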

Check of lack of fit in control point No. 7:

$$S_y^2 = \frac{1}{N}\sum_{u=1}^{N} S_u^2 = 0.0893;\quad S_u^2 = \frac{1}{n-1}\sum_{k=1}^{n}\left(y_{uk} - \bar y_u\right)^2;$$

$$f = N(n-1);\quad \alpha = 0.05;\quad l = 1;\quad n = 2.$$

where:

N is the total number of points in the simplex lattice design, including the control point;

u is the index of the current point in the simplex lattice design;

k ≤ n is the index of the current replicate at one point of the simplex lattice design.

For control point 7:

$$\xi = \sum_{1\le i\le q} a_i^2 + \sum_{1\le i<j\le q} a_{ij}^2;\quad a_i = X_i(2X_i - 1);\quad a_{ij} = 4X_iX_j.$$

$$a_1 = 0.375;\quad a_2 = -0.125;\quad a_3 = 0.0;$$

$$a_{12} = 0.75;\quad a_{13} = 0.0;\quad a_{23} = 0.0 \;\Rightarrow\; \xi = 0.7187.$$

Based on the obtained values it follows:

$$S_{\hat y}^2 = \frac{0.0893}{2} \times 0.7187 = 0.0321.$$

The calculated value of melting point from Eq. (3.64) is:

$$\hat y_7 = 69.8125$$

The experimental value in point 7 is:

$$\bar y_7 = 70.25$$

It follows that:

$$\Delta y_7 = |\bar y_7 - \hat y_7| = 0.44;\quad f = N(n-1) = 7(2-1) = 7;$$

$$t_R = \frac{\Delta y_7\sqrt{n}}{S_y\sqrt{1+\xi}} = \frac{0.44\sqrt{2}}{0.2988\sqrt{1 + 0.7187}} = 1.5885;$$

From the table for Student's distribution we get:

$$t_{0.05;\,7} = 2.36;\quad t_R = 1.5885 < t_T = 2.36$$

Hence, the regression model (3.64) is adequate. By analogy, an adequate regression model for penetration has also been obtained:

$$\hat y' = 22.5X_1 + 8.0X_2 + 156.0X_3 - 13.0X_1X_2 + 837.0X_1X_3 - 58.0X_2X_3 \tag{3.65}$$
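The lack-of-fit check for control point No. 7 of the melting-point model can be retraced numerically. The following is a minimal sketch, assuming the replicate pairs of points 1-6 and 7 from Table 3.19 and the tabulated value 2.36 quoted in the text; the helper names are hypothetical.

```python
import math


def xi_second_order(x):
    """xi = sum(a_i^2) + sum(a_ij^2) with a_i = X_i(2X_i - 1), a_ij = 4 X_i X_j."""
    q = len(x)
    a_i = [xi * (2.0 * xi - 1.0) for xi in x]
    a_ij = [4.0 * x[i] * x[j] for i in range(q) for j in range(i + 1, q)]
    return sum(a * a for a in a_i) + sum(a * a for a in a_ij)


def pooled_variance(replicates):
    """Average of the per-point replication variances (equal n at every point)."""
    variances = []
    for obs in replicates:
        mean = sum(obs) / len(obs)
        variances.append(sum((y - mean) ** 2 for y in obs) / (len(obs) - 1))
    return sum(variances) / len(variances)


def t_ratio(delta_y, s_y, n, xi):
    """t_R = delta_y * sqrt(n) / (S_y * sqrt(1 + xi))."""
    return delta_y * math.sqrt(n) / (s_y * math.sqrt(1.0 + xi))


# Melting-point replicates at points 1-6 and control point 7 (Table 3.19)
replicates = [(72.5, 73.0), (54.0, 54.5), (57.5, 57.0), (65.5, 66.0),
              (68.5, 68.5), (52.5, 52.5), (70.0, 70.5)]
T_TABLE = 2.36  # tabulated t for alpha = 0.05, f = 7, as quoted in the text

s2 = pooled_variance(replicates)                 # replication variance S_y^2
xi = xi_second_order([0.75, 0.25, 0.0])          # composition of point 7
t_r = t_ratio(abs(70.25 - 69.8125), math.sqrt(s2), 2, xi)
print(round(s2, 4), round(xi, 4), round(t_r, 3), t_r < T_TABLE)
```

Using the unrounded difference 0.4375 instead of 0.44 gives a ratio of about 1.58 rather than 1.5885; the conclusion that the second-order model is adequate is unchanged.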

The desired composition is obtained by solving this simultaneous system of equations:

$$\begin{cases}
66.0 = 72.75X_1 + 54.25X_2 + 57.25X_3 + 9.0X_1X_2 + 14.0X_1X_3 - 13.0X_2X_3\\
18.0 = 22.5X_1 + 8.0X_2 + 156.0X_3 - 13.0X_1X_2 + 837.0X_1X_3 - 58.0X_2X_3\\
1.0 = X_1 + X_2 + X_3
\end{cases}$$

$$X_1 = 0.6342;\quad X_2 = 0.3604;\quad X_3 = 0.0054$$

The  obtained  composition,  or  its  properties,  completely  satisfy  the  requirements. 

Example  3.10  [14] 

To obtain maximal bulk density or maximal quality of coke, a study was carried out on five coal granulations (size fractions). The granulations were obtained by crushing and classification using sieves. The experiment was performed according to a simplex lattice design matrix for a third-order regression model. Design points were prepared by mixing the initial coal granulations in the ratios defined by the design of experiment matrix. Bulk density was measured in an Agroskina apparatus, with each design point replicated five times. Particle-size properties of the initial coal are given in Table 3.20.

Table 3.20 Coal granulation

| Size of particles [mm] | 6-3 | 3-1 | 1-0.5 | 0.5-0.25 | 0.25-0 |
|------------------------|-----|-----|-------|----------|--------|
| Ratios of masses | X1 | X2 | X3 | X4 | X5 |

Regression coefficients are calculated by expressions (3.28):

$$\begin{aligned}
\hat y ={}& 658.3X_1 + 674.4X_2 + 627.8X_3 + 616.1X_4 + 639.6X_5 + 147.7X_1X_2\\
&+ 431.5X_1X_3 + 551.2X_1X_4 + 675.9X_1X_5 + 160.4X_2X_3 + 350.2X_2X_4\\
&+ 518.3X_2X_5 + 106.0X_3X_4 + 374.0X_3X_5 + 204.7X_4X_5 + 76.3X_1X_2(X_1 - X_2)\\
&+ 129.3X_1X_3(X_1 - X_3) + 416.5X_1X_4(X_1 - X_4) + 538.8X_1X_5(X_1 - X_5)\\
&+ 86.2X_2X_3(X_2 - X_3) + 235.8X_2X_4(X_2 - X_4) + 459.0X_2X_5(X_2 - X_5)\\
&+ 45.1X_3X_4(X_3 - X_4) + 329.8X_3X_5(X_3 - X_5) + 155.0X_4X_5(X_4 - X_5)\\
&+ 147.8X_1X_2X_3 + 448.3X_1X_2X_4 + 855.8X_1X_2X_5 - 420.2X_1X_3X_4\\
&+ 381.9X_1X_3X_5 - 221.3X_1X_4X_5 - 48.2X_2X_3X_4 + 624.2X_2X_3X_5\\
&- 57.0X_2X_4X_5 + 326.2X_3X_4X_5
\end{aligned}$$

Table 3.21 Simplex lattice design {5,3}

| Response mark | X1 | X2 | X3 | X4 | X5 | ȳ | S² |
|------|-------|-------|-------|-------|-------|--------|-------|
| y1 | 1 | 0 | 0 | 0 | 0 | 658.29 | 9.25 |
| y2 | 0 | 1 | 0 | 0 | 0 | 674.36 | 7.84 |
| y3 | 0 | 0 | 1 | 0 | 0 | 627.78 | 11.51 |
| y4 | 0 | 0 | 0 | 1 | 0 | 616.13 | 4.88 |
| y5 | 0 | 0 | 0 | 0 | 1 | 639.58 | 2.49 |
| y123 | 0.333 | 0.333 | 0.333 | 0 | 0 | 741.13 | 46.76 |
| y124 | 0.333 | 0.333 | 0 | 0.333 | 0 | 782.77 | 16.49 |
| y125 | 0.333 | 0.333 | 0 | 0 | 0.333 | 828.21 | 17.43 |
| y134 | 0.333 | 0 | 0.333 | 0.333 | 0 | 739.48 | 43.63 |
| y135 | 0.333 | 0 | 0.333 | 0 | 0.333 | 820.64 | 12.80 |
| y145 | 0.333 | 0 | 0 | 0.333 | 0.333 | 788.90 | 15.80 |
| y234 | 0 | 0.333 | 0.333 | 0.333 | 0 | 706.15 | 8.97 |
| y235 | 0 | 0.333 | 0.333 | 0 | 0.333 | 787.33 | 1.45 |
| y245 | 0 | 0.333 | 0 | 0.333 | 0.333 | 760.49 | 5.75 |
| y345 | 0 | 0 | 0.333 | 0.333 | 0.333 | 716.00 | 1.03 |
| y112 | 0.666 | 0.333 | 0 | 0 | 0 | 702.12 | 3.87 |
| y113 | 0.666 | 0 | 0.333 | 0 | 0 | 765.56 | 8.95 |
| y114 | 0.666 | 0 | 0 | 0.333 | 0 | 797.59 | 44.10 |
| y115 | 0.666 | 0 | 0 | 0 | 0.333 | 842.17 | 16.76 |
| y223 | 0 | 0.666 | 0.333 | 0 | 0 | 700.86 | 3.54 |
| y224 | 0 | 0.666 | 0 | 0.333 | 0 | 750.24 | 0.93 |
| y225 | 0 | 0.666 | 0 | 0 | 0.333 | 811.95 | 3.15 |
| y334 | 0 | 0 | 0.666 | 0.333 | 0 | 650.80 | 36.92 |
| y335 | 0 | 0 | 0.666 | 0 | 0.333 | 739.27 | 7.62 |
| y445 | 0 | 0 | 0 | 0.666 | 0.333 | 680.92 | 25.60 |
| y122 | 0.333 | 0.666 | 0 | 0 | 0 | 696.18 | 22.69 |
| y133 | 0.333 | 0 | 0.666 | 0 | 0 | 712.30 | 34.87 |
| y144 | 0.333 | 0 | 0 | 0.666 | 0 | 721.83 | 5.94 |
| y155 | 0.333 | 0 | 0 | 0 | 0.666 | 756.11 | 1.23 |
| y233 | 0 | 0.333 | 0.666 | 0 | 0 | 672.56 | 59.38 |
| y244 | 0 | 0.333 | 0 | 0.666 | 0 | 695.89 | 0.98 |
| y255 | 0 | 0.333 | 0 | 0 | 0.666 | 732.35 | 11.47 |
| y344 | 0 | 0 | 0.333 | 0.666 | 0 | 640.23 | 23.33 |
| y355 | 0 | 0 | 0.333 | 0 | 0.666 | 694.34 | 10.04 |
| y455 | 0 | 0 | 0 | 0.333 | 0.666 | 665.77 | 3.56 |

S_y² = 15.17; f = 35(5 - 1) = 140

Since the simplex lattice design is saturated, lack of fit of the obtained regression model is checked in additional control points, Table 3.22.

Table 3.22 Control design points

| Response mark | X1 | X2 | X3 | X4 | X5 | ȳ | ŷ | S_ŷ² |
|---------------|------|------|------|------|------|--------|--------|------|
| y1234 | 0.25 | 0.25 | 0.25 | 0.25 | 0 | 748.40 | 755.30 | 2.34 |
| y1235 | 0.25 | 0.25 | 0.25 | 0 | 0.25 | 825.78 | 825.64 | 2.34 |
| y1245 | 0.25 | 0.25 | 0 | 0.25 | 0.25 | 817.31 | 816.12 | 2.34 |
| y1345 | 0.25 | 0 | 0.25 | 0.25 | 0.25 | 782.12 | 782.95 | 2.34 |
| y2345 | 0 | 0.25 | 0.25 | 0.25 | 0.25 | 757.09 | 759.77 | 2.34 |
| y12345 | 0.20 | 0.20 | 0.20 | 0.20 | 0.20 | 797.65 | 800.33 | 1.77 |
| (mark not legible) | 0.15 | 0.25 | 0.15 | 0.20 | 0.25 | 798.80 | 799.44 | 1.74 |
| (mark not legible) | 0.10 | 0.20 | 0.20 | 0.25 | 0.25 | 783.33 | 777.08 | 1.78 |

A check of lack of fit by relation (3.59) shows that the regression model is adequate with 95% confidence.

Example  3.11 

In a study of the octane number of a three-component petrol mixture, nine design points were run: the six points of a simplex lattice design for a second-order model and three control points. The outcomes are given in Table 3.23.

Table 3.23 Simplex lattice design {3,2}

| Number of trial | X1 | X2 | X3 | y' | y'' | ȳ |
|-----------------|-------|-------|-------|-------|-------|--------|
| 1 | 1.00 | 0.00 | 0.00 | 100.8 | 100.9 | 100.85 |
| 2 | 0.00 | 1.00 | 0.00 | 85.2 | 85.6 | 85.40 |
| 3 | 0.00 | 0.00 | 1.00 | 86.0 | 85.0 | 85.50 |
| 4 | 0.50 | 0.50 | 0.00 | 88.8 | 89.3 | 89.05 |
| 5 | 0.50 | 0.00 | 0.50 | 90.3 | 90.7 | 90.50 |
| 6 | 0.00 | 0.50 | 0.50 | 85.5 | 85.4 | 85.45 |
| 7* | 0.333 | 0.333 | 0.333 | 88.3 | 88.8 | 88.55 |
| 8* | 0.150 | 0.595 | 0.255 | 86.6 | 86.8 | 86.70 |
| 9* | 0.300 | 0.490 | 0.210 | 87.6 | 88.1 | 87.85 |

* Control points


By using data from Table 3.23 and formulas (3.26), the regression coefficients of a second-order model are calculated:

$$\beta_1 = y_1 = 100.85;\quad \beta_2 = y_2 = 85.40;\quad \beta_3 = y_3 = 85.50;$$

$$\beta_{12} = 2(2y_{12} - y_1 - y_2) = 2(2 \times 89.05 - 100.85 - 85.40) = -16.30;$$

$$\beta_{13} = 2(2y_{13} - y_1 - y_3) = 2(2 \times 90.50 - 100.85 - 85.50) = -10.70;$$

$$\beta_{23} = 2(2y_{23} - y_2 - y_3) = 2(2 \times 85.45 - 85.40 - 85.50) = 0.00$$

The second-order regression model has this form:

$$\hat y = 100.85X_1 + 85.40X_2 + 85.50X_3 - 16.30X_1X_2 - 10.70X_1X_3$$

Check of lack of fit in control point No. 7:

$$\hat y_7 = 100.85 \times 0.333 + 85.40 \times 0.333 + 85.50 \times 0.333 - 16.30 \times 0.333 \times 0.333 - 10.70 \times 0.333 \times 0.333 = 87.58$$

$$\Delta y_7 = |\bar y_7 - \hat y_7| = |88.55 - 87.58| = 0.97$$

$$t_R = 3.16 > t_{(0.025;\,9)} = 2.66$$

The regression model is inadequate. The second-order model may be built up to an incomplete third-order model by including design point No. 7. By using the expression (3.27) we get:

$$\beta_{123} = 27 \times 88.55 - 12.0(89.05 + 90.5 + 85.45) + 3(100.85 + 85.4 + 85.5) = 26.1$$

The incomplete third-order model has the form:

$$\hat y = 100.85X_1 + 85.40X_2 + 85.50X_3 - 16.30X_1X_2 - 10.70X_1X_3 + 26.10X_1X_2X_3$$

A check of lack of fit in control points No. 8 and 9 showed that the incomplete cubic model is adequate. A graphic interpretation of this model as contour lines is shown in Fig. 3.11.
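The augmentation of the octane-number model with the ternary term can be retraced with a few lines of code. This is a sketch only, using the mean responses of Table 3.23; `predict` is a hypothetical helper that evaluates a canonical polynomial whose coefficients are keyed by index tuples.

```python
from math import prod


def predict(beta, x):
    """Evaluate sum over terms of beta[idx] * product of X_i for i in idx."""
    return sum(b * prod(x[i] for i in idx) for idx, b in beta.items())


# Second-order coefficients from the text (Table 3.23, points 1-6)
beta = {(1,): 100.85, (2,): 85.40, (3,): 85.50,
        (1, 2): -16.30, (1, 3): -10.70, (2, 3): 0.0}

# Ternary coefficient: 27*y123 - 12*(y12 + y13 + y23) + 3*(y1 + y2 + y3)
beta[(1, 2, 3)] = (27.0 * 88.55
                   - 12.0 * (89.05 + 90.50 + 85.45)
                   + 3.0 * (100.85 + 85.40 + 85.50))

# Control points 8 and 9 with their observed mean responses
control = {8: ({1: 0.150, 2: 0.595, 3: 0.255}, 86.70),
           9: ({1: 0.300, 2: 0.490, 3: 0.210}, 87.85)}

for no, (x, y_obs) in control.items():
    y_hat = predict(beta, x)
    print(no, round(y_hat, 2), round(abs(y_obs - y_hat), 2))
```

The residuals printed for points 8 and 9 are the Δy values that would enter the t-ratio of Eq. (3.59); at the exact centroid (1/3, 1/3, 1/3) the augmented model reproduces the observed mean 88.55 by construction.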


Figure  3.11  Contour  plot  of  a three-component  mixture 

Example  3.12  [12] 

To study the variation of reactivity and porosity of coke with different batch composition, coals of four process groups were analyzed and designated as X1, X2, X3 and X4. Experiments were conducted on a laboratory pilot plant. The characteristic of coke reactivity was the rate constant of the reaction C + CO2 = 2CO, determined at 1000 °C (y'). The coke porosity was given by the ratio of the true and apparent densities (y''). Assuming that the response surfaces of physical and chemical characteristics of the mixtures at hand can be approximated by polynomials of not very high degree, we shall seek the regression equation in the form of the second-degree polynomial.

In solving the problem, the simplex lattice design {4,2} has been utilized. The second-order design matrix for the quaternary system and the experimental results (each trial was repeated twice) are summarized in Table 3.24. By processing these outcomes, the following values of regression coefficients were obtained:


$$\beta_1 = 1.48;\quad \beta_2 = 0.32;\quad \beta_3 = 0.50;\quad \beta_4 = 0.53;$$

$$\beta_{12} = 4y_{12} - 2y_1 - 2y_2 = 4 \times 0.63 - 2 \times 1.48 - 2 \times 0.32 = -1.08;$$

$$\beta_{13} = 4y_{13} - 2y_1 - 2y_3 = 4 \times 0.92 - 2 \times 1.48 - 2 \times 0.50 = -0.22;$$

$$\beta_{14} = 4y_{14} - 2y_1 - 2y_4 = 4 \times 1.08 - 2 \times 1.48 - 2 \times 0.53 = 0.30;$$

$$\beta_{23} = 4y_{23} - 2y_2 - 2y_3 = 4 \times 0.39 - 2 \times 0.32 - 2 \times 0.50 = -0.08;$$

$$\beta_{24} = 4y_{24} - 2y_2 - 2y_4 = 4 \times 0.38 - 2 \times 0.32 - 2 \times 0.53 = -0.18;$$

$$\beta_{34} = 4y_{34} - 2y_3 - 2y_4 = 4 \times 0.54 - 2 \times 0.50 - 2 \times 0.53 = 0.10.$$

Thus the second-order polynomial for the reactivity of the quaternary mixture has the form:

$$\hat y' = 1.48X_1 + 0.32X_2 + 0.50X_3 + 0.53X_4 - 1.08X_1X_2 - 0.22X_1X_3 + 0.3X_1X_4 - 0.08X_2X_3 - 0.18X_2X_4 + 0.1X_3X_4$$


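Table 3.24 Simplex lattice design {4,2}

| No. | Response | X1 | X2 | X3 | X4 | y' | y'' |
|-----|----------|-----|-----|-----|-----|------|------|
| 1 | y1 | 1 | 0 | 0 | 0 | 1.48 | 54.0 |
| 2 | y2 | 0 | 1 | 0 | 0 | 0.32 | 55.2 |
| 3 | y3 | 0 | 0 | 1 | 0 | 0.50 | 43.3 |
| 4 | y4 | 0 | 0 | 0 | 1 | 0.53 | 45.3 |
| 5 | y12 | 0.5 | 0.5 | 0 | 0 | 0.63 | 53.1 |
| 6 | y13 | 0.5 | 0 | 0.5 | 0 | 0.92 | 48.0 |
| 7 | y14 | 0.5 | 0 | 0 | 0.5 | 1.08 | 49.0 |
| 8 | y23 | 0 | 0.5 | 0.5 | 0 | 0.39 | 46.3 |
| 9 | y24 | 0 | 0.5 | 0 | 0.5 | 0.38 | 47.1 |
| 10 | y34 | 0 | 0 | 0.5 | 0.5 | 0.54 | 44.0 |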

For the porosity:

$$\beta_1 = 54.0;\quad \beta_2 = 55.2;\quad \beta_3 = 43.3;\quad \beta_4 = 45.3;\quad \beta_{12} = -6.4;$$

$$\beta_{13} = -2.6;\quad \beta_{14} = -2.6;\quad \beta_{23} = -11.8;\quad \beta_{24} = -12.6;\quad \beta_{34} = -1.2;$$

and the regression equation will be:

$$\hat y'' = 54.0X_1 + 55.2X_2 + 43.3X_3 + 45.3X_4 - 6.4X_1X_2 - 2.6X_1X_3 - 2.6X_1X_4 - 11.8X_2X_3 - 12.6X_2X_4 - 1.2X_3X_4$$

To test the adequacy of the equations derived, 25 test points are used, Table 3.25. Their coordinates are selected so that a fourth-degree polynomial can be built if the regression equations fail to fit adequately. The replication errors are:

$$S_{y'} = 0.075;\quad S_{y''} = 1.5.$$

The number of degrees of freedom was f = 35. The number of replicate observations at each point is n = 2. At a significance level α = 0.05 and f = 35, t_T(0.05; 35) = 3.6. Thus, both equations are found to adequately fit the experiment.


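Table 3.25 Control points

Reactivity: ȳ' (experimental), ŷ' (calculated), Δy'. Porosity: ȳ'' (experimental), ŷ'' (calculated), Δy''.

| No. | Response | ȳ' | ŷ' | Δy' | ȳ'' | ŷ'' | Δy'' | ξ | t_R' | t_R'' |
|-----|----------|------|------|------|------|------|------|------|-------|-------|
| 1 | y1112 | 0.77 | 0.99 | 0.22 | 53.5 | 53.1 | 0.4 | 0.72 | 3.16 | 0.3 |
| 2 | y1113 | 1.15 | 1.19 | 0.04 | 51.0 | 50.8 | 0.2 | 0.72 | 0.575 | 0.15 |
| 3 | y1114 | 1.25 | 1.20 | 0.05 | 50.3 | 51.3 | 1.0 | 0.72 | 0.72 | 0.72 |
| 4 | y2223 | 0.31 | 0.34 | 0.03 | 49.0 | 50.1 | 1.1 | 0.72 | 0.43 | 0.80 |
| 5 | y2224 | 0.39 | 0.34 | 0.05 | 52.3 | 50.4 | 1.9 | 0.72 | 0.72 | 1.37 |
| 6 | y3334 | 0.55 | 0.52 | 0.03 | 45.0 | 43.6 | 1.4 | 0.72 | 0.43 | 0.5 |
| 7 | y1222 | 0.35 | 0.41 | 0.06 | 57.2 | 53.8 | 2.6 | 0.72 | 0.86 | 0.7 |
| 8 | y1333 | 0.75 | 0.7 | 0.05 | 44.0 | 45.5 | 1.5 | 0.72 | 0.72 | 1.2 |
| 9 | y1444 | 0.94 | 0.82 | 0.12 | 48.9 | 47.0 | 1.9 | 0.72 | 1.72 | 0.4 |
| 10 | y2333 | 0.51 | 0.44 | 0.07 | 43.4 | 44.0 | 0.7 | 0.72 | 1.0 | 0.75 |
| 11 | y2444 | 0.36 | 0.45 | 0.09 | 46.3 | 45.4 | 0.9 | 0.72 | 0.74 | 0.8 |
| 12 | y3444 | 0.57 | 0.50 | 0.07 | 44.5 | 44.6 | 0.1 | 0.72 | 1.34 | 0.6 |
| 13 | y1123 | 0.82 | 0.77 | 0.05 | 52.4 | 49.8 | 2.6 | 0.59 | 0.74 | 1.95 |
| 14 | y1124 | 0.90 | 0.81 | 0.09 | 51.5 | 50.1 | 1.4 | 0.59 | 1.3 | 1.2 |
| 15 | y1134 | 1.17 | 1.00 | 0.17 | 47.0 | 48.4 | 1.4 | 0.59 | 2.54 | 0.3 |
| 16 | y1223 | 0.49 | 0.51 | 0.02 | 50.6 | 49.5 | 1.1 | 0.59 | 0.3 | 0.5 |
| 17 | y1224 | 0.52 | 0.53 | 0.01 | 48.0 | 49.8 | 1.8 | 0.59 | 0.15 | 0.4 |
| 18 | y1334 | 0.76 | 0.75 | 0.01 | 46.7 | 45.8 | 0.9 | 0.59 | 0.15 | 1.3 |
| 19 | y2234 | 0.44 | 0.40 | 0.04 | 48.4 | 46.5 | 1.9 | 0.59 | 0.6 | 0.7 |
| 20 | y2334 | 0.48 | 0.46 | 0.02 | 46.8 | 44.4 | 2.4 | 0.59 | 0.3 | 1.0 |
| 21 | y1233 | 0.58 | 0.63 | 0.05 | 45.7 | 46.7 | 1.0 | 0.59 | 0.74 | 1.4 |
| 22 | y1244 | 0.59 | 0.68 | 0.09 | 47.0 | 47.6 | 0.6 | 0.59 | 1.34 | 0.8 |
| 23 | y1344 | 0.78 | 0.80 | 0.02 | 49.0 | 46.4 | 2.6 | 0.59 | 0.3 | 1.2 |
| 24 | y2344 | 0.42 | 0.44 | 0.02 | 46.0 | 44.8 | 1.2 | 0.59 | 0.3 | 0.8 |
| 25 | y1234 | 0.60 | 0.65 | 0.05 | 48.0 | 47.0 | 1.0 | 0.44 | 0.78 | 0.79 |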

3.4 

Simplex  Centroid  Design 

Scheffe's simplex centroid designs contain $2^q - 1$ points, q of which fall on pure components, $C_q^2$ on binary mixtures, $C_q^3$ on ternary mixtures, and so forth, with one observation on the q-component mixture. Simplex centroid designs consist of the points whose coordinates are (1, 0, ..., 0), (1/2, 1/2, 0, ..., 0), ..., (1/q, 1/q, ..., 1/q), and of all the points that can be obtained from these by permutations of coordinates. Thus, the design contains a point at the center (centroid) of the simplex and the centroids of all the component simplexes of lesser dimension, its proper faces.

Polynomials obtained from simplex-centroid designs contain as many coefficients as there are points in the design, and for the q-component mixture they have the form:

$$\hat y = \sum_{1\le i\le q}\beta_iX_i + \sum_{1\le i<j\le q}\beta_{ij}X_iX_j + \sum_{1\le i<j<k\le q}\beta_{ijk}X_iX_jX_k + \dots + \beta_{12\dots q}X_1X_2\dots X_q \tag{3.66}$$

For a given number of components q, there exists only one simplex-centroid design. The simplex-lattice design intended to derive the polynomial of incomplete third degree is the simplex-centroid design for a ternary mixture (Fig. 3.10b). By way of example, we build the simplex-centroid design for a quaternary system (q = 4). The number of observations in the design is $N = 2^q - 1 = 2^4 - 1 = 15$. The arrangement of points over the concentration tetrahedron is shown in Fig. 3.10c, and the respective simplex-centroid design is shown in Table 3.26.
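The construction rule just described (one equal-proportion blend for every non-empty subset of components) can be sketched in a few lines. The function name `simplex_centroid_design` is a hypothetical helper; exact fractions are used so that every row sums to exactly one.

```python
from fractions import Fraction
from itertools import combinations


def simplex_centroid_design(q):
    """Return the 2**q - 1 points of the simplex-centroid design for q components.

    Each point is the centroid of one non-empty subset of components:
    the r chosen components get 1/r each, all others get 0.
    """
    points = []
    for r in range(1, q + 1):
        for subset in combinations(range(q), r):
            points.append(tuple(Fraction(1, r) if i in subset else Fraction(0)
                                for i in range(q)))
    return points


design = simplex_centroid_design(4)
for no, point in enumerate(design, start=1):
    print(no, [str(v) for v in point])
```

For q = 4 the loop yields 15 rows: 4 pure components, 6 binary blends, 4 ternary blends and the overall centroid.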


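Table 3.26 Matrix of simplex centroid design for a quaternary system q=4

| N | X1 | X2 | X3 | X4 | y |
|----|-----|-----|-----|-----|-------|
| 1 | 1 | 0 | 0 | 0 | y1 |
| 2 | 0 | 1 | 0 | 0 | y2 |
| 3 | 0 | 0 | 1 | 0 | y3 |
| 4 | 0 | 0 | 0 | 1 | y4 |
| 5 | 1/2 | 1/2 | 0 | 0 | y12 |
| 6 | 1/2 | 0 | 1/2 | 0 | y13 |
| 7 | 1/2 | 0 | 0 | 1/2 | y14 |
| 8 | 0 | 1/2 | 1/2 | 0 | y23 |
| 9 | 0 | 1/2 | 0 | 1/2 | y24 |
| 10 | 0 | 0 | 1/2 | 1/2 | y34 |
| 11 | 1/3 | 1/3 | 1/3 | 0 | y123 |
| 12 | 1/3 | 1/3 | 0 | 1/3 | y124 |
| 13 | 1/3 | 0 | 1/3 | 1/3 | y134 |
| 14 | 0 | 1/3 | 1/3 | 1/3 | y234 |
| 15 | 1/4 | 1/4 | 1/4 | 1/4 | y1234 |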

The polynomial of Eq. (3.66) for q=4 includes 15 terms and has the form:

$$\begin{aligned}
\hat y ={}& \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_4X_4 + \beta_{12}X_1X_2 + \beta_{13}X_1X_3 + \beta_{14}X_1X_4 + \beta_{23}X_2X_3\\
&+ \beta_{24}X_2X_4 + \beta_{34}X_3X_4 + \beta_{123}X_1X_2X_3 + \beta_{124}X_1X_2X_4 + \beta_{134}X_1X_3X_4 + \beta_{234}X_2X_3X_4\\
&+ \beta_{1234}X_1X_2X_3X_4
\end{aligned} \tag{3.67}$$

Making recourse to the saturation property of the design and substituting in succession the coordinates of experimental points 1 through 15 into the polynomial of Eq. (3.67), we determine the polynomial coefficients:

$$\beta_1 = y_1;\quad \beta_2 = y_2;\quad \beta_3 = y_3;\quad \beta_4 = y_4; \tag{3.68}$$

$$\begin{aligned}
\beta_{12} &= 4y_{12} - 2y_1 - 2y_2;\\
\beta_{13} &= 4y_{13} - 2y_1 - 2y_3;\\
\beta_{14} &= 4y_{14} - 2y_1 - 2y_4;\\
\beta_{23} &= 4y_{23} - 2y_2 - 2y_3;\\
\beta_{24} &= 4y_{24} - 2y_2 - 2y_4;\\
\beta_{34} &= 4y_{34} - 2y_3 - 2y_4;
\end{aligned} \tag{3.69}$$

$$\begin{aligned}
\beta_{123} &= 27y_{123} - 12(y_{12} + y_{13} + y_{23}) + 3(y_1 + y_2 + y_3)\\
\beta_{124} &= 27y_{124} - 12(y_{12} + y_{14} + y_{24}) + 3(y_1 + y_2 + y_4)\\
\beta_{134} &= 27y_{134} - 12(y_{13} + y_{14} + y_{34}) + 3(y_1 + y_3 + y_4)\\
\beta_{234} &= 27y_{234} - 12(y_{23} + y_{24} + y_{34}) + 3(y_2 + y_3 + y_4)
\end{aligned} \tag{3.70}$$

$$\beta_{1234} = 256y_{1234} - 108(y_{123} + y_{124} + y_{134} + y_{234}) + 32(y_{12} + y_{13} + y_{14} + y_{23} + y_{24} + y_{34}) - 4(y_1 + y_2 + y_3 + y_4) \tag{3.71}$$

In a similar manner, for the polynomial of Eq. (3.66), for the q-component mixture, regression coefficients are calculated as follows:

$$\beta_i = y_i \tag{3.72}$$

$$\beta_{ij} = 4y_{ij} - 2y_i - 2y_j = 2\left[2y_{ij} - (y_i + y_j)\right] \tag{3.73}$$

$$\beta_{ijk} = 27y_{ijk} - 12(y_{ij} + y_{ik} + y_{jk}) + 3(y_i + y_j + y_k) = 3\left[9y_{ijk} - 4(y_{ij} + y_{ik} + y_{jk}) + (y_i + y_j + y_k)\right] \tag{3.74}$$

$$\begin{aligned}
\beta_{ijkm} ={}& 256y_{ijkm} - 108(y_{ijk} + y_{ijm} + y_{ikm} + y_{jkm}) + 32(y_{ij} + y_{ik} + y_{im} + y_{jk} + y_{jm} + y_{km}) - 4(y_i + y_j + y_k + y_m)\\
={}& 4\left[64y_{ijkm} - 27(y_{ijk} + y_{ijm} + y_{ikm} + y_{jkm}) + 8(y_{ij} + y_{ik} + y_{im} + y_{jk} + y_{jm} + y_{km}) - (y_i + y_j + y_k + y_m)\right]
\end{aligned} \tag{3.75}$$

In the general case, the formula for coefficients of the regression equation obtained from the simplex-centroid design takes the form [6]:

$$\beta_{S_r} = r\sum_{t=1}^{r}(-1)^{r-t}\,t^{\,r-1}S_t \tag{3.76}$$

where:

r is the number of indices at the coefficient β;

$S_t$ is the sum of experimental results for all the t-component mixtures taken in equal proportions (1/t).

For example, for the $\beta_{ijk}$ coefficients we have r = 3 (i, j, k) and three sums:

$$y_i + y_j + y_k = S_1;\quad 1/t = 1 \tag{3.77}$$

$$y_{ij} + y_{jk} + y_{ik} = S_2;\quad 1/t = 1/2 \tag{3.78}$$

$$y_{ijk} = S_3;\quad 1/t = 1/3 \tag{3.79}$$

Thus:

$$\begin{aligned}
\beta_{ijk} &= 3\left[(-1)^{3-1}1^{3-1}S_1 + (-1)^{3-2}2^{3-1}S_2 + (-1)^{3-3}3^{3-1}S_3\right]\\
&= 3\left[(y_i + y_j + y_k) - 4(y_{ij} + y_{ik} + y_{jk}) + 9y_{ijk}\right]
\end{aligned} \tag{3.80}$$

Adequacy of a regression equation derived by the simplex-centroid design is tested, and the confidence intervals of property values predicted by the equation are assigned, in much the same way as in the case of the simplex-lattice method.
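The general coefficient formula of Eq. (3.76), r times the alternating sum of $t^{r-1}S_t$, lends itself to a direct implementation. The sketch below assumes responses stored in a dictionary keyed by sorted index tuples; `centroid_coefficient` is a hypothetical name, and the data are the durability values y2 of Table 3.27 from the example that follows.

```python
from itertools import combinations


def centroid_coefficient(indices, y):
    """Coefficient beta for the index set `indices` from simplex-centroid data.

    y maps sorted index tuples, e.g. (1,), (1, 2), (1, 2, 3), to the response
    measured at the equal-proportion blend of those components.
    Implements beta = r * sum_{t=1..r} (-1)**(r - t) * t**(r - 1) * S_t.
    """
    r = len(indices)
    total = 0.0
    for t in range(1, r + 1):
        # S_t: sum of responses over all t-component sub-blends of `indices`
        s_t = sum(y[sub] for sub in combinations(indices, t))
        total += (-1) ** (r - t) * t ** (r - 1) * s_t
    return r * total


# Durability data (y2) of Table 3.27, simplex-centroid design for q = 3
y = {(1,): 62, (2,): 73, (3,): 47,
     (1, 2): 64, (1, 3): 55, (2, 3): 72,
     (1, 2, 3): 67}

for idx in [(1, 2), (1, 3), (2, 3), (1, 2, 3)]:
    print(idx, centroid_coefficient(idx, y))
```

For these data the function returns -14, 2, 48 and 63, the interaction coefficients of Eq. (3.82).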

Example  3.13  [12] 

We seek to determine how the activity (y1) and durability (y2) of a platinum catalyst supported by a nonporous metal carrier depend on the catalyst composition at 350 °C. The total mass of components was maintained constant from experiment to experiment. Taking it to be unity, we can write:

$$\sum_{i=1}^{3} X_i = 1$$

where:

X1 is the Pt/Al2O3 component, a reforming catalyst;

X2 and X3 are the other components, inorganic oxides of metals belonging to Groups II and III of the periodic table.

The simplex centroid design for q=3 is applied. The design matrix and experimental results are arrayed in Table 3.27.

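Table 3.27 Simplex centroid design for q=3

| N | X1 | X2 | X3 | y1 | y2 |
|---|-------|-------|-------|------|----|
| 1 | 1 | 0 | 0 | 97.4 | 62 |
| 2 | 0 | 1 | 0 | 3.0 | 73 |
| 3 | 0 | 0 | 1 | 4.7 | 47 |
| 4 | 0.5 | 0.5 | 0 | 70.0 | 64 |
| 5 | 0.5 | 0 | 0.5 | 66.0 | 55 |
| 6 | 0 | 0.5 | 0.5 | 6.8 | 72 |
| 7 | 0.333 | 0.333 | 0.333 | 95.4 | 67 |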

Using Eq. (3.27) and Table 3.27, the coefficients of regression equations are derived for both the catalyst activity and durability:

$$\hat y_1 = 97.4X_1 + 3.0X_2 + 4.7X_3 + 79.3X_1X_2 + 59.9X_1X_3 + 11.8X_2X_3 + 175.35X_1X_2X_3 \tag{3.81}$$

$$\hat y_2 = 62X_1 + 73X_2 + 47X_3 - 14X_1X_2 + 2X_1X_3 + 48X_2X_3 + 63X_1X_2X_3 \tag{3.82}$$

The replication error in measuring the catalyst activity is $S_{y_1} = 3.24$, and in measuring durability $S_{y_2} = 2.37$. The adequacy of the regression equations (3.81) and (3.82) is tested using the Student t-test at the control-test points 8, 9 and 10, Table 3.28.

Table 3.28 Check of lack of fit of regressions - Eqs. (3.81); (3.82)

| N | X1 | X2 | X3 | y1 | ŷ1 | y2 | ŷ2 |
|----|-------|-------|-------|----|----|----|----|
| 8 | 0.333 | 0.667 | 0 | 46 | 52 | 72 | 66 |
| 9 | 0.667 | 0.333 | 0 | 96 | 84 | 63 | 70 |
| 10 | 0.580 | 0.320 | 0.097 | 91 | 98 | 62 | 65 |

At all the test-control points, the values of the t-test are less than the tabulated value at the significance level α=0.05. Fig. 3.12 presents the lines of constant catalyst activity and durability plotted from Eqs. (3.81) and (3.82). The greatest activity of the catalyst corresponds to the area where the component X1 > 0.4. A durability of 65 per cent appears quite satisfactory. Of greatest interest are the points lying where the equal yield curves for y2 = 65 [%] and y1 = 100 [%] intersect. Trial 10, executed within the specified area, gave good agreement (within experimental error) of experimental results with theory.


Figure 3.12 Isolines for y1 and y2


3.5 

Extreme  Vertices  Designs 

It has been explained that when testing mixture diagrams, the factor space is usually a regular simplex with q vertices in a (q-1)-dimensional space. In such a case, the task of the mathematical theory of experiments consists of determining, in the given simplex, the minimum possible number of points at which trials are to be run and from which the coefficients of a polynomial that adequately describes system behavior are determined. This problem, for the case when there are no limitations on ratios of individual components, as presented in the previous chapter, was solved by Scheffe in 1958 [5]. However, in practice a researcher is often faced with multicomponent mixtures where definite limitations are imposed on the ratios of individual components:

$$0 \le a_i \le X_i \le b_i \le 1;\quad \sum_{i=1}^{q} X_i = 1;\quad i = 1, 2, \dots, q$$

where $a_i$ and $b_i$ are the lower and upper limits on the ratio of the i-th component.

This situation occurs when doing experiments in the simplex vertices has no physical sense, or when the researcher is only interested in a local region of the simplex space.

In this case the local region studied is usually a polyhedron whose vertices and center make up the set of points of an experimental design. Depending on the number of components, the local region may be:


1.  The region studied is a simplex

The local area of interest in the diagram may be an irregular simplex with the vertex coordinates:

$$A_1\left(X_1^{(1)}, X_2^{(1)}, \dots, X_q^{(1)}\right);\; A_2\left(X_1^{(2)}, X_2^{(2)}, \dots, X_q^{(2)}\right);\; \dots;\; A_q\left(X_1^{(q)}, X_2^{(q)}, \dots, X_q^{(q)}\right).$$

In order that Scheffe's simplex lattice designs may be applied to this case, a renormalization is performed and the compositions at vertices $A_j$ (j = 1, 2, ..., q) are taken to be independent pseudocomponents $Z_j$, so that over the whole range of the local simplex the following condition is met:

$$\sum_{j=1}^{q} Z_j = 1 \tag{3.83}$$

The experimental design is built in the pseudocomponent coordinates. All the designs discussed earlier can be built in the new variables $Z_1, Z_2, \dots, Z_q$ that satisfy the condition of Eq. (3.83). To conduct the experiments it is required to convert the pseudocomponents $Z_i$ into the initial components with real ratios $X_i$. For the u-th design point this conversion is defined by the formula:

$$X_i^{(u)} = X_i^{(1)} + Z_2^{(u)}\left(X_i^{(2)} - X_i^{(1)}\right) + Z_3^{(u)}\left(X_i^{(3)} - X_i^{(1)}\right) + \dots + Z_q^{(u)}\left(X_i^{(q)} - X_i^{(1)}\right) \tag{3.84}$$

where $X_i^{(j)}$ is the content of the i-th component at the vertex $Z_j$ ($A_j$).
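The conversion of Eq. (3.84) from pseudocomponents to real component contents is a short computation. The following sketch uses, as an illustration, the vertex compositions (in per cent) of the local triangle from Example 3.14 later in this section; the function name `pseudo_to_real` is a hypothetical helper.

```python
def pseudo_to_real(z, vertices):
    """Convert pseudocomponent fractions z into real component contents.

    z        : sequence (Z_1, ..., Z_q) with sum(z) == 1
    vertices : vertices[j] is the real composition of vertex A_(j+1)
    Implements X_i = X_i^(1) + sum_{j>=2} Z_j * (X_i^(j) - X_i^(1)).
    """
    base = vertices[0]
    x = [float(v) for v in base]
    for j in range(1, len(vertices)):
        for i in range(len(base)):
            x[i] += z[j] * (vertices[j][i] - base[i])
    return x


# Vertices of the local simplex in Example 3.14, in per cent (X1, X2, X3)
vertices = [(100, 0, 0), (40, 60, 0), (50, 0, 50)]

print(pseudo_to_real((0.5, 0.5, 0.0), vertices))   # design point 4
print(pseudo_to_real((0.2, 0.2, 0.6), vertices))   # control point 17
```

The two calls reproduce, up to floating-point rounding, the compositions (70, 30, 0) and (58, 12, 30) listed for these points in Table 3.29.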

After the design has been realized, the coefficients of the regression equation are calculated in pseudocomponent coordinates:

$$\hat y = f(Z_1, Z_2, \dots, Z_q) \tag{3.85}$$

using the earlier relationships for the pertinent designs, and then adequacy of fit is tested. For Eq. (3.85) to be applied in practice, it is written in the initial coordinate system using an affine transformation of the form:

$$\left.\begin{aligned}
Z_1 &= Z_1^{(1)} + X_2\left(Z_1^{(2)} - Z_1^{(1)}\right) + X_3\left(Z_1^{(3)} - Z_1^{(1)}\right) + \dots + X_q\left(Z_1^{(q)} - Z_1^{(1)}\right)\\
Z_2 &= Z_2^{(1)} + X_2\left(Z_2^{(2)} - Z_2^{(1)}\right) + X_3\left(Z_2^{(3)} - Z_2^{(1)}\right) + \dots + X_q\left(Z_2^{(q)} - Z_2^{(1)}\right)\\
&\;\;\vdots\\
Z_{q-1} &= Z_{q-1}^{(1)} + X_2\left(Z_{q-1}^{(2)} - Z_{q-1}^{(1)}\right) + X_3\left(Z_{q-1}^{(3)} - Z_{q-1}^{(1)}\right) + \dots + X_q\left(Z_{q-1}^{(q)} - Z_{q-1}^{(1)}\right)
\end{aligned}\right\} \tag{3.86}$$

Values of $Z_j^{(i)}$ are calculated by solving the (q-1) sets of equations below, one set of q equations for each pseudocomponent $Z_j$ (j = 1, 2, ..., q-1):

$$X_1^{(m)}Z_j^{(1)} + X_2^{(m)}Z_j^{(2)} + X_3^{(m)}Z_j^{(3)} + \dots + X_q^{(m)}Z_j^{(q)} = \begin{cases}1, & m = j\\ 0, & m \ne j\end{cases}\qquad m = 1, 2, \dots, q \tag{3.87}$$

that is, in each set the right-hand side equals 1 for the vertex $A_j$ and 0 for all the other vertices.

where:

$Z_j^{(i)}$ is the content of the pseudocomponent $Z_j$ at the i-th vertex of the initial simplex;

$X_i^{(j)}$ is the i-th component content of vertices $Z_j$ ($A_j$) (j = 1, 2, ..., q).

As such a coordinate conversion is only possible for equations in independent variables, the initial regression equation shall be transformed by eliminating one variable, e.g. the last one, the q-th, as follows:

$$Z_q = 1 - \sum_{i=1}^{q-1} Z_i \tag{3.88}$$
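The inverse step, expressing a real composition in pseudocomponents, amounts to solving a small linear system: the real composition is the Z-weighted sum of the vertex compositions. The sketch below is one possible way to do this with exact fractions and Gauss-Jordan elimination; it is not the book's procedure of Eqs. (3.86) and (3.87) written out term by term, and the name `real_to_pseudo` and the vertex data (taken from Example 3.14) are illustrative assumptions.

```python
from fractions import Fraction


def real_to_pseudo(x, vertices):
    """Solve sum_j Z_j * X_i^(j) = X_i (i = 1..q) for the pseudocomponents Z_j.

    x        : real composition (same units as the vertex compositions)
    vertices : vertices[j] is the real composition of vertex A_(j+1)
    Raises StopIteration if the vertices do not span a proper simplex.
    """
    q = len(vertices)
    # Augmented matrix: row i holds X_i^(1..q) and the right-hand side X_i
    a = [[Fraction(vertices[j][i]) for j in range(q)] + [Fraction(x[i])]
         for i in range(q)]
    for col in range(q):
        pivot = next(r for r in range(col, q) if a[r][col] != 0)
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]          # normalize pivot row
        for r in range(q):
            if r != col and a[r][col] != 0:                 # eliminate column
                factor = a[r][col]
                a[r] = [vr - factor * vc for vr, vc in zip(a[r], a[col])]
    return [a[i][q] for i in range(q)]


# Vertices of the local simplex in Example 3.14, in per cent (X1, X2, X3)
vertices = [(100, 0, 0), (40, 60, 0), (50, 0, 50)]

print(real_to_pseudo((58, 12, 30), vertices))   # control point 17 of Table 3.29
print(real_to_pseudo((0, 100, 0), vertices))    # pure X2, outside the local simplex
```

A point inside the local triangle gives non-negative pseudocomponents that sum to one, while a pure original component generally maps to pseudocomponent values outside the range 0 to 1; these are the $Z_j^{(i)}$ quantities that enter Eq. (3.86).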

Example  3.14  [13] 

The boiling of the ternary mixture H2O-K2HPO4-K2CO3 is studied. It is required to define the regression equation relating the boiling point y [°C] to the mixture composition (in per cent). Not the whole concentration triangle is covered, but only a subarea of unsaturated solutions at 20 °C, i.e. a local section of the diagram in the form of a triangle with the vertices Z1(100; 0; 0), Z2(40; 60; 0) and Z3(50; 0; 50), Fig. 3.13.

To deduce the regression equation, an extreme-vertices design is performed with pseudocomponents Z1, Z2 and Z3; the content of the initial components is then determined from Eq. (3.84). The regression equations of the second and incomplete third order are found to be inadequate. Using the composition property of simplex lattice designs, the design matrix is augmented further to yield a fourth-order regression equation, Table 3.29. The experimental conditions are expressed in terms of pseudocomponents $Z_i$ and in the natural scale $X_i$ (per cent). The mean values of temperature measurements are determined from two replicate observations. The replication error is $S_y = 0.86$. The number of degrees of freedom for the error is f = 20.

The  coefficients  of  the  fourth-order  regression  equation  are  calculated  by 
Eq.  (3.29)  using  the  property  of  saturated  design  matrix.  The  regression  equation  in 
pseudocomponent  variables  has  the  form: 
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X2 


k2co3 


Figure  3.13  Extreme  vertices  simplex  of  a three-component  mix- 
ture: K2HPO4-K2CO3-H20 

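$$
\begin{aligned}
y ={} & 99.81 Z_1 + 113.51 Z_2 + 115.69 Z_3 - 14.22 Z_1 Z_2 - 12.13 Z_1 Z_3 + 0.91 Z_2 Z_3 \\
& + 6.18 Z_1 Z_2 (Z_1 - Z_2) + 10.12 Z_1 Z_3 (Z_1 - Z_3) - 15.34 Z_2 Z_3 (Z_2 - Z_3) \\
& + 6.90 Z_1 Z_2 (Z_1 - Z_2)^2 - 17.61 Z_1 Z_3 (Z_1 - Z_3)^2 - 6.32 Z_2 Z_3 (Z_2 - Z_3)^2 \\
& + 1.07 Z_1^2 Z_2 Z_3 - 274.61 Z_1 Z_2^2 Z_3 + 142.21 Z_1 Z_2 Z_3^2
\end{aligned}
\tag{3.89}
$$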


Table 3.29 Extreme vertices design

No.    Z1      Z2      Z3      X1      X2      X3      y
1      1       0       0       100     0       0       99.9
2      0       1       0       40      60      0       113.5
3      0       0       1       50      0       50      115.7
4      0.5     0.5     0       70      30      0       103.1
5      0.5     0       0.5     75      0       25      104.8
6      0       0.5     0.5     45      30      25      114.8
7      0.333   0.333   0.333   63.33   20      16.67   105.6
8      0.75    0.25    0       85      15      0       101.5
9      0.25    0.75    0       55      45      0       107.2
10     0.75    0       0.25    87.5    0       12.5    101.6
11     0.25    0       0.75    62.5    0       37.5    107.7
12     0       0.75    0.25    42.5    45      12.5    112.5
13     0       0.25    0.75    47.5    15      37.5    116.4
14     0.5     0.25    0.25    72.5    15      12.5    103.4
15     0.25    0.5     0.25    57.5    30      12.5    104.4
16     0.25    0.25    0.5     60      15      25      109.0
17*    0.2     0.2     0.6     58      12      30      108.3
18*    0.5     0.125   0.375   73.75   7.5     18.75   103.3
19*    0.4     0.15    0.45    68.5    9       22.5    104.2
20*    0.3     0.175   0.525   63.25   10.5    26.25   106.2
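Because the fourth-order design is saturated, Eq. (3.89) should reproduce the responses at the design points of Table 3.29 up to rounding. The following is a minimal sketch of that check, not part of the original example. The function name `boiling_point` and the selection of trials are illustrative, and the coefficient of the (Z2 - Z3)^2 term is assumed to be -6.32, which is the sign that reproduces the tabulated responses of trials 12 and 13.

```python
# Fourth-order model of Eq. (3.89) in pseudocomponents Z1, Z2, Z3.
# Assumption: the (Z2 - Z3)^2 coefficient is -6.32, the sign that matches
# the tabulated responses on the Z2-Z3 edge (trials 12 and 13).
def boiling_point(z1, z2, z3):
    linear = 99.81 * z1 + 113.51 * z2 + 115.69 * z3
    quadratic = -14.22 * z1 * z2 - 12.13 * z1 * z3 + 0.91 * z2 * z3
    cubic = (6.18 * z1 * z2 * (z1 - z2)
             + 10.12 * z1 * z3 * (z1 - z3)
             - 15.34 * z2 * z3 * (z2 - z3))
    quartic_edge = (6.90 * z1 * z2 * (z1 - z2) ** 2
                    - 17.61 * z1 * z3 * (z1 - z3) ** 2
                    - 6.32 * z2 * z3 * (z2 - z3) ** 2)
    quartic_inner = (1.07 * z1 ** 2 * z2 * z3
                     - 274.61 * z1 * z2 ** 2 * z3
                     + 142.21 * z1 * z2 * z3 ** 2)
    return linear + quadratic + cubic + quartic_edge + quartic_inner


# (trial, Z1, Z2, Z3, measured y) copied from Table 3.29
design_points = [
    (4, 0.5, 0.5, 0.0, 103.1),
    (8, 0.75, 0.25, 0.0, 101.5),
    (12, 0.0, 0.75, 0.25, 112.5),
    (13, 0.0, 0.25, 0.75, 116.4),
    (14, 0.5, 0.25, 0.25, 103.4),
]

# Residual = predicted minus measured response at each selected trial
residuals = {
    trial: boiling_point(z1, z2, z3) - y
    for trial, z1, z2, z3, y in design_points
}

for trial, r in residuals.items():
    print(f"trial {trial}: residual {r:+.3f} deg C")
```

The residuals stay within the rounding of the tabulated coefficients, which is what one expects from a saturated design; the control points 17 to 20 are the ones that actually test adequacy.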

Table  3.30  summarizes  the  results  of  the  regression  equation  testing  for  adequacy 
of  fit: 

Table 3.30 Control points

No. of trial   y       ŷ       Δy    ξ     t_R
17             108.3   110.7   2.4   1.3   2.16
18             103.3   100.9   2.4   1.0   2.27
19             104.2   107.0   2.8   1.0   2.66
20             106.3   108.7   2.4   1.1   2.16

The tabulated value of Student's t is t_T(0.05/4; 20) = 2.8. Equation (3.89) is an adequate fit to the experiment. In Eq. (3.89), we convert from pseudocomponents Z_i to initial variables X_i. For the problem in hand, the sets of simultaneous equations (3.87) take the form:

$$
\begin{aligned}
1\,Z_1^{(1)} + 0\,Z_1^{(2)} + 0\,Z_1^{(3)} &= 1 \\
0.4\,Z_1^{(1)} + 0.6\,Z_1^{(2)} + 0\,Z_1^{(3)} &= 0 \\
0.5\,Z_1^{(1)} + 0\,Z_1^{(2)} + 0.5\,Z_1^{(3)} &= 0 \\
1\,Z_2^{(1)} + 0\,Z_2^{(2)} + 0\,Z_2^{(3)} &= 0 \\
0.4\,Z_2^{(1)} + 0.6\,Z_2^{(2)} + 0\,Z_2^{(3)} &= 1 \\
0.5\,Z_2^{(1)} + 0\,Z_2^{(2)} + 0.5\,Z_2^{(3)} &= 0
\end{aligned}
$$

Solving these we obtain:

$$
Z_1^{(1)} = 1; \quad Z_1^{(2)} = -0.7; \quad Z_1^{(3)} = -1; \quad
Z_2^{(1)} = 0; \quad Z_2^{(2)} = 1.7; \quad Z_2^{(3)} = 0
$$


Substituting the above solutions into Eq. (3.86) we arrive at formulas relating natural coordinates X_i to coordinates Z_i:

$$
\begin{aligned}
Z_1 &= 1 - 1.7X_2 - 2X_3 \\
Z_2 &= 1.7X_2 \\
Z_3 &= 1 - Z_1 - Z_2 = 2X_3
\end{aligned}
\tag{3.91}
$$
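The coefficients behind Eq. (3.91) can be recomputed directly from the vertex compositions. The sketch below is illustrative only: the helper names `solve` and `to_pseudo` are not from the source, the vertices are entered as mass fractions, and the unrounded values -2/3 and 5/3 are used where the text quotes -0.7 and 1.7.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


# Vertices of the local simplex of Example 3.14 as mass fractions (X1, X2, X3)
vertices = [
    [1.0, 0.0, 0.0],  # Z1
    [0.4, 0.6, 0.0],  # Z2
    [0.5, 0.0, 0.5],  # Z3
]

# Row i holds Z_i^(1), Z_i^(2), Z_i^(3): pseudocomponent i must equal 1 at
# its own vertex and 0 at the other two vertices.
coeffs = [
    solve(vertices, [1.0 if u == i else 0.0 for u in range(3)])
    for i in range(3)
]


def to_pseudo(x):
    """Convert a composition in mass fractions to pseudocomponents Z1..Z3."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in coeffs]


print(coeffs)
print(to_pseudo([0.725, 0.15, 0.125]))  # trial 14 of Table 3.29
```

Feeding in the natural composition of any row of Table 3.29 should return that row's pseudocomponent coordinates, which gives a quick consistency check on the design matrix.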


And, substituting Eq. (3.91) into Eq. (3.89), we get the regression equation in initial coordinates:

$$
\begin{aligned}
y ={} & 99.88 + 20.82X_2 - 7.63X_3 + 92.88X_2X_3 - 107.83X_2^2 + 279.28X_3^2 \\
& - 1373.69X_2^2X_3 + 243.59X_2X_3^2 + 2230.35X_2^2X_3^2 + 312.78X_2^3 - 965.12X_3^3 \\
& + 2146.05X_2^3X_3 - 179.60X_2X_3^3 - 212.96X_2^4 + 1127.1X_3^4
\end{aligned}
\tag{3.92}
$$

For more convenient use of the regression equation, the isotherms are plotted in Fig. 3.14.

Figure 3.14 Isotherms


2.  The  space  studied  is  a polyhedron 

With constraints on the component concentration variation, in the general case, the space studied forms a polyhedron. In the experimental design we should somehow distribute the points over the polyhedron subject to the condition:

$$
0 \le a_i \le X_i \le b_i \le 1 \tag{3.93}
$$

This excludes degenerate cases:

$$
\sum_{i=1}^{q} a_i > 1; \qquad \sum_{i=1}^{q} b_i < 1 \tag{3.94}
$$

As with all previous designs of experiments, the number of design points or trials grows very rapidly (as a power function) with the number of factors (components). The need to reduce the number of design points requires that the remaining points of the design cover the local factor space evenly. When the number of components is above five (q > 5), calculating the combinations of experimental conditions of possible design points would take several billion arithmetic operations. In such a situation even the fastest IBM computers are unable to finish the calculations within the time needed to perform the experiment physically.

To reduce the scope of calculations and to formalize the approach to the choice of design points of a design of experiment, McLean and Anderson [16] suggested this procedure:

1.  All the possible combinations of the two levels a_i and b_i are put down for each and every component, but in each combination the content of one component is omitted. The number of these combinations for a q-component mixture is $q \times 2^{q-1}$;

2.  Among all the combinations those are selected whose sum of components is less than one and that meet the limitations of Eq. (3.93). Into the combinations selected the omitted components are added in amounts defined by the relationship $\sum X_i = 1$. The design points thus obtained and satisfying Eq. (3.93) lie at vertices of the bounding polyhedron;

3.  To the design points obtained are added center points (centroids) of two-, three-, ..., and (q-1)-dimensional faces of the polyhedron and its center point. Coordinates of a central point are determined by taking average coordinates of previously chosen vertices;

4.  Distances between vertices and center of polyhedron are calculated by:

$$
d_{ij} = \left[ \sum_{r=1}^{q} \left( \frac{X_{ir} - X_{jr}}{b_r - a_r} \right)^2 \right]^{0.5} \tag{3.95}
$$

5.  The point with maximal distance from the polyhedron center, point 1, is inserted into the design;

6.  The distance between the chosen point 1 and other polyhedron vertices is determined, and the farthest-away vertex becomes part of the design of the experiment as point 2;

7.  The normed distance d_ij is accepted. The more points that have to be included in the design of experiment, the smaller the normed distance. The distance recommended is:


,SR 

»c 


< 4 ^ ( 


2d?)°'5 


(3.96) 


where: 

dc  is  the  average  distance  of  vertex  from  center; 


8.  All  previously  included  points  are  omitted  from  the  design  if  their  distances 
from  points  1 and  2 is  smaller  than  the  accepted  norm; 

9.  From  the  remaining  points,  the  remotest  point  from  the  center  (3.96)  is 
included  into  the  design. 


This is how the vertices of the local factor space, i.e. their coordinates, are selected. It should be noted that those are pseudocomponent coordinates.
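The normalized distance of Eq. (3.95) is simple to compute once the bounds a_r and b_r are known. The sketch below is a hedged illustration rather than the authors' code: the function name `normalized_distance` is hypothetical, and the bounds, vertex and center point are borrowed from Example 3.15 (Tables 3.31 and 3.32) purely to have concrete numbers.

```python
import math


def normalized_distance(p, q, lower, upper):
    """Eq. (3.95): Euclidean distance with each coordinate scaled by b_r - a_r."""
    return math.sqrt(sum(
        ((pr - qr) / (b - a)) ** 2
        for pr, qr, a, b in zip(p, q, lower, upper)
    ))


# Bounds a_r and b_r for the four components of Example 3.15
lower = [0.40, 0.10, 0.10, 0.03]
upper = [0.60, 0.50, 0.50, 0.08]

vertex = [0.40, 0.10, 0.47, 0.03]        # vertex (1) of Table 3.31
center = [0.50, 0.2225, 0.2225, 0.055]   # polyhedron center point, Table 3.32

d = normalized_distance(vertex, center, lower, upper)
print(f"normalized distance of vertex (1) from the center: {d:.4f}")
```

Scaling by the range of each component keeps a narrow factor such as the binder (0.03 to 0.08) from being swamped by the wide ones when vertices are ranked by distance.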


Example  3.15  [16] 

Consider the building of McLean and Anderson's design for the investigation and optimization of luminance of luminous mixtures, whose components are: X1 - magnesium; X2 - soda; X3 - strontium nitrate and X4 - binder. The mixture composition is subjected to the following constraints:

0.40 ≤ X1 ≤ 0.60; 0.10 ≤ X2 ≤ 0.50; 0.10 ≤ X3 ≤ 0.50; 0.03 ≤ X4 ≤ 0.08

Table 3.31 summarizes all the possible combinations of mixture composition with one of the components missing. Accordingly, eight design points, i.e. polyhedron vertices (Fig. 3.15), are obtained.

Figure 3.15 Local factor space


Table 3.31 McLean-Anderson design

         Component content                        Component content
N        X1     X2      X3     X4       N         X1     X2     X3      X4
1        0.40   0.10    0.10   -        17 (1)    0.40   0.10   0.47*   0.03
2        0.40   0.10    0.50   -        18 (2)    0.40   0.10   0.42*   0.08
3        0.40   0.50    0.10   -        19        0.40   0.50   -       0.03
4        0.40   0.50    0.50   -        20        0.40   0.50   -       0.08
5        0.60   0.10    0.10   -        21 (3)    0.60   0.10   0.27*   0.03
6        0.60   0.10    0.50   -        22 (4)    0.60   0.10   0.22*   0.08
7        0.60   0.50    0.10   -        23        0.60   0.50   -       0.03
8        0.60   0.50    0.50   -        24        0.60   0.50   -       0.08
9 (5)    0.40   0.47*   0.10   0.03     25        -      0.10   0.10    0.03
10 (6)   0.40   0.42*   0.10   0.08     26        -      0.10   0.10    0.08
11       0.40   -       0.50   0.03     27        -      0.10   0.50    0.03
12       0.40   -       0.50   0.08     28        -      0.10   0.50    0.08
13 (7)   0.60   0.27*   0.10   0.03     29        -      0.50   0.10    0.03
14 (8)   0.60   0.22*   0.10   0.08     30        -      0.50   0.10    0.08
15       0.60   -       0.50   0.03     31        -      0.50   0.50    0.03
16       0.60   -       0.50   0.08     32        -      0.50   0.50    0.08

* Amount of component added


These points are to be supplemented by the coordinates of the center points of all the polyhedron faces, Table 3.32. The coordinates of the polyhedron center point are found by averaging the coordinates of all eight design vertices, and the centroid coordinates of the faces by averaging the coordinates of the points belonging to each face.

Table 3.32 Selection of face center points in McLean and Anderson's design

        Component content
N       X1     X2       X3       X4      Points of face
(9)     0.50   0.10     0.345    0.055   (1) (2) (3) (4)
(10)    0.50   0.345    0.10     0.055   (5) (6) (7) (8)
(11)    0.40   0.2725   0.2725   0.055   (1) (2) (5) (6)
(12)    0.60   0.1725   0.1725   0.055   (3) (4) (7) (8)
(13)    0.50   0.2350   0.2350   0.030   (1) (3) (5) (7)
(14)    0.50   0.2100   0.2100   0.080   (2) (4) (6) (8)
(15)    0.50   0.2225   0.2225   0.055   Polyhedron center point
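Steps 1 to 3 of the McLean-Anderson procedure can be sketched for the constraints of this example as follows. This is a minimal illustration under stated assumptions, not the original authors' program: the function name `extreme_vertices`, the tolerance `TOL` and the rounding to six decimals are choices made here.

```python
from itertools import product

# Lower and upper bounds a_i, b_i for X1..X4 in Example 3.15
lower = [0.40, 0.10, 0.10, 0.03]
upper = [0.60, 0.50, 0.50, 0.08]
TOL = 1e-9  # guards the bound checks against floating-point noise


def extreme_vertices(lower, upper):
    """Enumerate q * 2**(q-1) level combinations and keep the admissible vertices."""
    q = len(lower)
    found = set()
    candidates = 0
    for omitted in range(q):
        others = [i for i in range(q) if i != omitted]
        # Every combination of lower/upper levels for the retained components
        for levels in product(*[(lower[i], upper[i]) for i in others]):
            candidates += 1
            fill = 1.0 - sum(levels)  # omitted component makes the sum equal 1
            if lower[omitted] - TOL <= fill <= upper[omitted] + TOL:
                point = [0.0] * q
                for i, value in zip(others, levels):
                    point[i] = value
                point[omitted] = fill
                found.add(tuple(round(v, 6) for v in point))
    return sorted(found), candidates


vertices, candidates = extreme_vertices(lower, upper)

# Polyhedron center point: average of the vertex coordinates
centroid = [sum(v[i] for v in vertices) / len(vertices) for i in range(len(lower))]

print(candidates, "combinations,", len(vertices), "vertices")
for v in vertices:
    print(v)
print("center:", [round(c, 4) for c in centroid])
```

The same averaging applied to the four vertices of a single face gives the face centroids listed as points (9) to (14).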

For  the  quaternary  mixture,  the  design  of  McLean  and  Anderson,  together  with  the 
experimental  results,  is  provided  in  Table  3.33. 


Table 3.33 McLean-Anderson design for quaternary mixture

N    X1     X2     X3     X4     y       N    X1     X2       X3       X4      y
1    0.40   0.10   0.47   0.03   75      9    0.50   0.10     0.345    0.055   220
2    0.40   0.10   0.42   0.08   180     10   0.50   0.345    0.10     0.055   200
3    0.60   0.10   0.27   0.03   195     11   0.40   0.2725   0.2725   0.055   190
4    0.60   0.10   0.22   0.08   300     12   0.60   0.1725   0.1725   0.055   310
5    0.40   0.47   0.10   0.03   145     13   0.50   0.235    0.235    0.030   200
6    0.40   0.42   0.10   0.08   230     14   0.50   0.210    0.210    0.080   410
7    0.60   0.27   0.10   0.03   220     15   0.50   0.2225   0.2225   0.055   425
8    0.60   0.22   0.10   0.08   350

The coefficients of the reduced second-degree polynomial are found by the method of least squares. Here the regression equation will be:

$$
\begin{aligned}
y ={} & -1558X_1 - 2351X_2 - 2426X_3 + 14372X_4 + 8300X_1X_2 + 8076X_1X_3 \\
& - 6625X_1X_4 + 3213X_2X_3 - 16998X_2X_4 - 17127X_3X_4
\end{aligned}
\tag{3.97}
$$

As the dependence of the property on components is described adequately by the second-order regression equation, the possibility presented itself to find optimal conditions through the use of nonlinear programming. Subject to the restrictions of Eq. (3.94), the conditions providing the maximum luminance are found to be:

y_max = 397.48 for X1 = 0.5233; X2 = 0.2299; X3 = 0.1668; X4 = 0.080.
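The quoted optimum can be checked by evaluating Eq. (3.97) at the stated composition. The sketch below rests on three assumptions that should be read as such: the coefficients are whole numbers (the points in the printed coefficients acting as thousands separators), the X2 coefficient is -2351, and X3 = 0.1668. These are the readings for which the composition sums to one and the quoted maximum is reproduced. The function name `luminance` is illustrative.

```python
def luminance(x1, x2, x3, x4):
    """Reduced second-degree polynomial of Eq. (3.97); x values are fractions."""
    return (-1558 * x1 - 2351 * x2 - 2426 * x3 + 14372 * x4
            + 8300 * x1 * x2 + 8076 * x1 * x3 - 6625 * x1 * x4
            + 3213 * x2 * x3 - 16998 * x2 * x4 - 17127 * x3 * x4)


# Composition reported as optimal (assumed X3 = 0.1668 so that the sum is 1)
optimum = (0.5233, 0.2299, 0.1668, 0.0800)
y_opt = luminance(*optimum)

print(f"sum of fractions = {sum(optimum):.4f}")
print(f"predicted luminance at the optimum = {y_opt:.2f}")
```

The binder sits at its upper bound of 0.08 in this composition, so the maximum lies on the boundary of the polyhedron rather than in its interior.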

As the number of mixture components increases, the number of points in the McLean-Anderson design grows rapidly. A reduction in the number of observations may be achieved by eliminating some of the face-center points, or by eliminating points in such a way that the remaining ones stay distributed more or less uniformly over the space under study.

Example 3.16 [17]

In a study of ballistic properties of a three-modal composite rocket propellant, the effect of coarse, medium and fine fractions of ammonium perchlorate on burning rate at 70 bar and 25 °C has been mathematically modeled. Limitations were imposed on the ratios of all three granulations of ammonium perchlorate:
x1 - fine fraction AP-7 µm = 0.30-0.70;
x2 - coarse fraction AP-400 µm = 0.0-0.40;
x3 - medium fraction AP-200 µm = 0.30-0.70.

Geometric  interpretation  of  the  local  factor  space  is  given  in  Fig.  3.16. 


Relations between real and coded ratios of ammonium perchlorate fractions are given by:

$$
\begin{cases}
x_1 = 0.70X_1 + 0.30X_2 + 0.30X_3 \\
x_2 = 0.40X_2 \\
x_3 = 0.30X_1 + 0.30X_2 + 0.70X_3
\end{cases}
\tag{3.98}
$$

where:

x_i (i = 1, 2, 3) are real ratios of the i-th fraction;

X_i are coded ratios of the i-th fraction.

The design matrix has been defined in accord with the theory of extreme vertices designing of experiments, Table 3.34.

Table 3.34 Extreme vertices design

       Design matrix       Response   Operational matrix        Response
No.    X1    X2    X3      mark       x1      x2      x3        y
1      1     0     0       y1         70.00   0.00    30.00     3.92
2      0     1     0       y2         30.00   40.00   30.00     4.75
3      0     0     1       y3         30.00   0.00    70.00     4.95
4      1/3   2/3   0       y122       43.33   26.66   30.00     4.25
5      1/3   0     2/3     y133       43.33   0.00    56.66     4.44
6      0     1/3   2/3     y233       30.00   13.33   56.66     4.82
7      2/3   1/3   0       y112       56.66   13.33   30.00     3.84
8      2/3   0     1/3     y113       56.66   0.00    43.33     4.24
9      0     2/3   1/3     y223       30.00   26.66   43.33     5.06
10     1/3   1/3   1/3     y123       43.33   13.33   43.33     4.27
11*    1/2   1/4   1/4     y1123      50.00   10.00   40.00     4.09
12*    1/4   1/4   1/2     y1233      40.00   10.00   50.00     3.75
13*    3/4   1/4   0       y1112      60.00   10.00   30.00     3.83
14*    3/4   0     1/4     y1113      60.00   0.00    40.00     4.04
15*    1/4   3/4   0       y1222      40.00   30.00   30.00     4.56

* Control points


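The design matrix has been defined for a third-order regression model, as previous research proved that such a model may adequately describe experimental outcomes. Regression coefficients are determined from the relations:

$$
\beta_i = y_i \;\Rightarrow\; \beta_1 = y_1 = 3.92; \quad \beta_2 = y_2 = 4.75; \quad \beta_3 = y_3 = 4.95;
$$

$$
\beta_{ij} = \frac{9}{4}\left(y_{iij} + y_{ijj} - y_i - y_j\right) \;\Rightarrow\; \beta_{12} = -1.31; \quad \beta_{13} = -0.43; \quad \beta_{23} = -0.4;
$$

$$
\gamma_{ij} = \frac{9}{4}\left(3y_{iij} - 3y_{ijj} - y_i + y_j\right) \;\Rightarrow\; \gamma_{12} = -0.90; \quad \gamma_{13} = 0.97; \quad \gamma_{23} = 2.07;
$$

$$
\beta_{ijk} = 27y_{ijk} - \frac{27}{4}\left(y_{iij} + y_{ijj} + y_{iik} + y_{ikk} + y_{jjk} + y_{jkk}\right) + \frac{9}{2}\left(y_i + y_j + y_k\right) \;\Rightarrow\; \beta_{123} = -3.31.
$$

The regression model has the form:

$$
\begin{aligned}
y ={} & 3.92X_1 + 4.75X_2 + 4.95X_3 - 1.31X_1X_2 - 0.43X_1X_3 - 0.4X_2X_3 \\
& - 0.90X_1X_2(X_1 - X_2) + 0.97X_1X_3(X_1 - X_3) + 2.07X_2X_3(X_2 - X_3) \\
& - 3.31X_1X_2X_3
\end{aligned}
\tag{3.99}
$$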
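The coefficient relations for the third-order lattice can be applied mechanically to the responses of Table 3.34. The sketch below is illustrative: the function names `pair_coefficients` and `triple_coefficient` are hypothetical, the 9/4, 27/4 and 9/2 factors are the standard third-order simplex-lattice ones, and only the pairs (1,2) and (1,3) and the ternary coefficient are evaluated.

```python
def pair_coefficients(y_i, y_j, y_iij, y_ijj):
    """Binary coefficients beta_ij and gamma_ij of the third-order lattice model."""
    beta_ij = 9.0 / 4.0 * (y_iij + y_ijj - y_i - y_j)
    gamma_ij = 9.0 / 4.0 * (3.0 * y_iij - 3.0 * y_ijj - y_i + y_j)
    return beta_ij, gamma_ij


def triple_coefficient(vertex_responses, edge_responses, y_ijk):
    """Ternary coefficient beta_ijk from the vertex, edge and centroid responses."""
    return (27.0 * y_ijk
            - 27.0 / 4.0 * sum(edge_responses)
            + 9.0 / 2.0 * sum(vertex_responses))


# Burning-rate responses of Table 3.34 (trials 1-10)
y1, y2, y3 = 3.92, 4.75, 4.95
y112, y122 = 3.84, 4.25
y113, y133 = 4.24, 4.44
y223, y233 = 5.06, 4.82
y123 = 4.27

beta12, gamma12 = pair_coefficients(y1, y2, y112, y122)
beta13, gamma13 = pair_coefficients(y1, y3, y113, y133)
beta123 = triple_coefficient(
    [y1, y2, y3], [y112, y122, y113, y133, y223, y233], y123
)

print(f"beta12 = {beta12:.3f}, gamma12 = {gamma12:.3f}")
print(f"beta13 = {beta13:.3f}, gamma13 = {gamma13:.3f}")
print(f"beta123 = {beta123:.3f}")
```

Because the ten-point design is saturated for a ten-term model, these relations are exact solutions of the interpolation conditions rather than least-squares estimates.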

It should be noted that pseudocomponents or coded factors appear in the regression model. A check of lack of fit of the regression model in control points has shown that the regression model is adequate with 95 % confidence.

Example 3.17 [18]

A researcher's objective is to establish the optimal composition of a composite rocket propellant with respect to ballistic properties such as burning rate and specific impulse. To achieve this, an extreme vertices design of experiment has been set up for these three components of the composition:

x1 - ammonium perchlorate 65-77 %

x2 - aluminum powder 8-20 %

x3 - polyurethane binder 15-27 %

The geometric interpretation of the local factor space is given in Fig. 3.17. Relations between the coded and real ratios are given as follows:

$$
\begin{cases}
x_1 = 0.77X_1 + 0.65X_2 + 0.65X_3 \\
x_2 = 0.08X_1 + 0.20X_2 + 0.08X_3 \\
x_3 = 0.15X_1 + 0.15X_2 + 0.27X_3
\end{cases}
\tag{3.100}
$$

The extreme vertices design for a third-order regression model is given in Table 3.35. Regression coefficients for the impulse and burning rate at a pressure of 70 bar and a temperature of 25 °C are determined from experimental outcomes.
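The conversion of Eq. (3.100) from coded to real ratios is a plain matrix-vector product in which each column is the real composition of one vertex of the local simplex. The sketch below is a minimal illustration; the names `A` and `to_real` are chosen here and do not come from the source.

```python
# Coefficient matrix of Eq. (3.100): column j is the real composition
# (x1, x2, x3) of the vertex where coded ratio X_j equals 1.
A = [
    [0.77, 0.65, 0.65],  # ammonium perchlorate
    [0.08, 0.20, 0.08],  # aluminum powder
    [0.15, 0.15, 0.27],  # polyurethane binder
]


def to_real(coded):
    """Map coded ratios (X1, X2, X3) to real ratios (x1, x2, x3)."""
    return [sum(a * c for a, c in zip(row, coded)) for row in A]


point4 = to_real([1.0 / 3.0, 2.0 / 3.0, 0.0])   # trial 4 of Table 3.35
point11 = to_real([0.5, 0.25, 0.25])            # control trial 11

print([round(100 * v, 1) for v in point4])
print([round(100 * v, 1) for v in point11])
```

Multiplying the results by 100 gives the percentages of the operational matrix in Table 3.35, so the function can be used to regenerate or audit that part of the table.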


Table 3.35 Extreme vertices design

       Design matrix      Response   Operational matrix    Measured response   Predicted response
No.    X1    X2    X3     mark       x1     x2     x3      y1       y2         ŷ1        ŷ2
1      1     0     0      y1         77.0   8.0    15.0    2275.1   1.40       2275.10   1.400
2      0     1     0      y2         65.0   20.0   15.0    2200.0   1.11       2200.00   1.110
3      0     0     1      y3         65.0   8.0    27.0    1853.5   0.54       1853.50   0.540
4      1/3   2/3   0      y122       69.0   16.0   15.0    2255.5   1.15       2232.57   1.139
5      1/3   0     2/3    y133       69.0   8.0    23.0    2030.0   0.75       2009.13   0.742
6      0     1/3   2/3    y233       65.0   12.0   23.0    1961.3   0.67       1941.87   0.663
7      2/3   1/3   0      y112       73.0   12.0   15.0    2265.3   1.24       2242.57   1.229
8      2/3   0     1/3    y113       73.0   8.0    19.0    2128.0   0.90       2148.26   0.896
9      0     2/3   1/3    y223       65.0   16.0   19.0    2098.6   0.62       2077.37   0.618
10     1/3   1/3   1/3    y123       69.0   12.0   19.0    2255.5   0.93       2230.27   0.921
11*    1/2   1/4   1/4    y1123      71.0   11.0   18.0    2118.2   1.00       2266.73   1.01
12*    1/4   1/4   1/2    y1233      68.0   11.0   21.0    1980.9   0.66       2172.36   0.87
13*    3/4   1/4   0      y1112      74.0   11.0   15.0    2196.7   1.17       2265.97   1.27
14*    3/4   0     1/4    y1113      74.0   8.0    18.0    2137.8   0.93       2155.19   0.98
15*    1/4   3/4   0      y1222      68.0   17.0   15.0    -        1.08       2247.70   1.14

* Control points


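1.  Third-order model for specific impulse

$$
\beta_1 = 2275.1; \; \beta_2 = 2200.0; \; \beta_3 = 1853.5; \; \beta_{12} = 102.83; \; \beta_{13} = 66.15; \; \beta_{23} = 14.40;
$$

$$
\gamma_{12} = -102.83; \; \gamma_{13} = -287.10; \; \gamma_{23} = 147.15; \; \beta_{123} = 3390.98.
$$

$$
\begin{aligned}
y ={} & 2275.1X_1 + 2200.0X_2 + 1853.5X_3 + 102.83X_1X_2 + 66.15X_1X_3 + 14.40X_2X_3 \\
& - 102.83X_1X_2(X_1 - X_2) - 287.10X_1X_3(X_1 - X_3) + 147.15X_2X_3(X_2 - X_3) \\
& + 3390.98X_1X_2X_3
\end{aligned}
\tag{3.101}
$$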

1.1  Check of lack of fit

According to preliminary information it is:

$$
s_y^2 = \frac{1}{N}\sum_{u=1}^{N} s_u^2 = 156.06; \qquad s_u^2 = \frac{1}{n-1}\sum_{k=1}^{n}\left(y_k - \bar{y}\right)^2;
$$

N = 2; f = N(n - 1) = 2(2 - 1) = 2; α = 0.05; l = 4; n = 2.

a)  for control point 11 it is:

$$
\xi = \sum_i a_i^2 + \sum_{i<j} a_{ij}^2; \qquad a_i = x_i(2x_i - 1); \qquad a_{ij} = 4x_ix_j
$$

a1 = 0.5(2 × 0.5 - 1) = 0.0;  a12 = 4 × 0.5 × 0.25 = 0.5;

a2 = 0.25(2 × 0.25 - 1) = -0.125;  a13 = 4 × 0.5 × 0.25 = 0.5;

a3 = 0.25(2 × 0.25 - 1) = -0.125;  a23 = 4 × 0.25 × 0.25 = 0.25;

ξ = 0.59;  Δy11 = |y11 - ŷ11| = |2118.2 - 2266.7| = 148.5

$$
t_R = \frac{\Delta y_{11}\sqrt{n}}{s_y\sqrt{1+\xi}} = \frac{148.5\sqrt{2}}{12.49\sqrt{1+0.59}} = 13.30 > t_T(0.01;\,2) = 9.92.
$$

The regression model (3.101) is not adequate in point number 11.

b)  for control point 15 it is:

a1 = 0.75(2 × 0.75 - 1) = 0.38;  a12 = 4 × 0.75 × 0.0 = 0.0;

a2 = 0.0(2 × 0.0 - 1) = 0.0;  a13 = 4 × 0.75 × 0.25 = 0.75;

a3 = 0.25(2 × 0.25 - 1) = -0.125;  a23 = 4 × 0.0 × 0.25 = 0.00;

ξ = 0.72;  Δy15 = |y15 - ŷ15| = |2137.8 - 2155.19| = 17.39.

$$
t_R = \frac{17.39\sqrt{2}}{12.49\sqrt{1+0.72}} = 1.50 < t_T(0.01;\,2) = 9.92
$$

Table 3.36 Check of lack of fit of regression - Eq. (3.101)

       Coded ratios
No.    X1     X2     X3     y        ŷ        |Δy|     ξ      t_R     t_T    Adequacy
11     0.50   0.25   0.25   2118.2   2266.7   148.50   0.59   13.30   9.92   Inadequate
12     0.25   0.25   0.50   1980.9   2172.3   191.46   0.59   17.15   9.92   Inadequate
14     0.25   0.75   0.00   2196.7   2265.9   69.27    0.72   5.97    9.92   Adequate
15     0.75   0.00   0.25   2137.8   2155.2   17.39    0.72   1.50    9.92   Adequate

The regression model is adequate in control point 15. A check of lack of fit of regression model (3.101) in all control points is given in Table 3.36.
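The lack-of-fit calculation for a control point follows the same pattern each time, so it can be sketched compactly. The code below is an illustration under stated assumptions: the function names `xi_factor` and `t_ratio` are hypothetical, the a_i and a_ij expressions are the ones used in the worked calculation above, and the inputs are the specific-impulse values quoted for control points 11 and 15.

```python
import math


def xi_factor(x):
    """xi = sum a_i^2 + sum a_ij^2, with a_i = x_i(2x_i - 1) and a_ij = 4 x_i x_j."""
    q = len(x)
    single = sum((xi * (2.0 * xi - 1.0)) ** 2 for xi in x)
    cross = sum((4.0 * x[i] * x[j]) ** 2
                for i in range(q) for j in range(i + 1, q))
    return single + cross


def t_ratio(y_obs, y_pred, x, s2_y, n):
    """t_R = |dy| sqrt(n) / (s_y sqrt(1 + xi)) for one control point."""
    return (abs(y_obs - y_pred) * math.sqrt(n)
            / (math.sqrt(s2_y) * math.sqrt(1.0 + xi_factor(x))))


S2_Y = 156.06   # replication variance of the specific impulse
N_REP = 2       # replicate observations per point
T_TABLE = 9.92  # tabulated t_T(0.01; 2) used in the text

t11 = t_ratio(2118.2, 2266.7, [0.5, 0.25, 0.25], S2_Y, N_REP)
t15 = t_ratio(2137.8, 2155.19, [0.75, 0.0, 0.25], S2_Y, N_REP)

for label, t in (("11", t11), ("15", t15)):
    verdict = "adequate" if t < T_TABLE else "inadequate"
    print(f"control point {label}: t_R = {t:.2f} -> {verdict}")
```

With only two degrees of freedom the tabulated value is large, so the test is lenient; even so, the deviation at point 11 is big enough to be flagged.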

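2.  Third-order model for burning rate

$$
\beta_1 = 1.40; \; \beta_2 = 1.11; \; \beta_3 = 0.54; \; \beta_{12} = -0.27; \; \beta_{13} = -0.65; \; \beta_{23} = -0.81;
$$

$$
\beta_{123} = 2.86; \; \gamma_{12} = -0.045; \; \gamma_{13} = -0.923; \; \gamma_{23} = -1.62.
$$

$$
\begin{aligned}
y ={} & 1.40X_1 + 1.11X_2 + 0.54X_3 - 0.27X_1X_2 - 0.65X_1X_3 - 0.81X_2X_3 \\
& - 0.045X_1X_2(X_1 - X_2) - 0.923X_1X_3(X_1 - X_3) - 1.62X_2X_3(X_2 - X_3) + 2.86X_1X_2X_3
\end{aligned}
\tag{3.102}
$$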

A check of lack of fit of the obtained regression model is given in Table 3.37.

Table 3.37 Check of lack of fit of regression - Eq. (3.102)

       Coded ratios
No.    X1     X2     X3     y      ŷ      |Δy|   ξ      t_R     t_T    Adequacy
11     0.50   0.25   0.25   1.00   1.01   0.01   0.59   0.165   9.92   Adequate
12     0.25   0.25   0.50   0.66   0.87   0.21   0.59   3.45    9.92   Adequate
13     0.25   0.75   0.00   1.08   1.14   0.06   0.70   0.950   9.92   Adequate
14     0.75   0.25   0.00   1.17   1.27   0.10   0.72   1.580   9.92   Adequate
15     0.75   0.00   0.25   0.93   0.98   0.05   0.72   0.790   9.92   Adequate

The  geometric  interpretation  in  the  form  of  contour  graphs  for  both  regression 
models  is  given  in  Figs.  3.18  and  3.19. 

By  overlapping  the  simplex  with  specific  impulse  and  burning  rate  contour  lines, 
we  can  determine  the  optimal  composition  of  a composite  rocket  propellant  in  a 
very  simple  way. 


Figure 3.18 Extreme vertices design of impulse contour lines, Ns/kg

Figure 3.19 Extreme vertices design of burning rate contour lines, cm/s

3.6

D-optimal Designs

The most important among the known criteria of design optimality are the requirements of D- and G-optimality. A design is said to be D-optimal when it minimizes the volume of the scatter ellipsoid for estimates of regression equation coefficients.

The property of G-optimality provides the least maximum variance of predicted response values in a region under investigation.

Simplex-lattice designs exhibit the properties of D- and G-optimality in the building of second- and incomplete third-degree polynomials only. Scheffé's designs of higher degree are not D-optimal [19]. A D-optimal simplex lattice for the third-degree polynomial was deduced later by Kiefer [20]. If we consider a set of designs with the coordinates of points:

$$
\begin{aligned}
& X_i = 1; \quad X_j = X_k = 0 \\
& X_i = 1 - X_j = b; \quad X_k = 0; \quad b \ne 1/2 \\
& X_i = X_j = X_k = 1/3
\end{aligned}
\tag{3.103}
$$

then to produce a third-degree polynomial a design will be D-optimal at:

$$
b = \left(1 - 1/\sqrt{5}\right)/2.
$$

In the example the points in the faces of the simplex are taken with coordinates: X_i = 0.2764 and X_j = 0.7236 [12].

Table 3.38 tabulates the D-optimal design for the derivation of a ternary system third-degree polynomial. Following this design the coefficients are obtained for a third-degree polynomial having the same form as that from a conventional simplex lattice:

$$
y = \sum_{1 \le i \le q} \beta_i X_i + \sum_{1 \le i < j \le q} \beta_{ij} X_i X_j + \sum_{1 \le i < j \le q} \gamma_{ij} X_i X_j (X_i - X_j) + \sum_{1 \le i < j < k \le q} \beta_{ijk} X_i X_j X_k \tag{3.104}
$$

Table 3.38 D-optimal design for a ternary system third-degree polynomial {3,3}

N    X1       X2       X3    y       N     X1       X2       X3       y
1    1        0        0     y1      6     0.7236   0        0.2764   y113
2    0        1        0     y2      7     0.2764   0        0.7236   y133
3    0        0        1     y3      8     0        0.7236   0.2764   y223
4    0.7236   0.2764   0     y112    9     0        0.2764   0.7236   y233
5    0.2764   0.7236   0     y122    10    0.333    0.333    0.333    y123

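Formulas for the polynomial coefficients are derived by substituting the coordinates of points into the regression equation:

$$
\begin{aligned}
\beta_i &= y_i \\
\beta_{ij} &= \frac{5}{2}\left(y_{iij} + y_{ijj} - y_i - y_j\right) \\
\gamma_{ij} &= \frac{5}{2}\left[5\left(y_{iij} - y_{ijj}\right) - y_i + y_j\right] \\
\beta_{ijk} &= 27y_{ijk} - \frac{15}{2}\left(y_{iij} + y_{ijj} + y_{iik} + y_{ikk} + y_{jjk} + y_{jkk}\right) + 6\left(y_i + y_j + y_k\right)
\end{aligned}
\tag{3.105}
$$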

The adequacy test and the assignment of confidence intervals using a D-optimal design (Table 3.38) are accomplished along the same lines as in the simplex-lattice method. The variation of ξ with composition is given in the reference literature [12]. In constructing the fourth-order polynomial for the ternary system, the design will be D-optimal at:

$$
X_i = \left(7 - \sqrt{21}\right)/14; \quad X_j = 1 - X_i; \quad X_k = 0 \tag{3.106}
$$

or

X_i = 0.1727; X_j = 0.8273; X_k = 0

Moreover, in the fourth-order D-optimal design there are points with coordinates:

$$
X_i = X_j = \left(7 - \sqrt{5}\right)/22; \quad X_k = 1 - (X_i + X_j) \tag{3.107}
$$

or

X_i = X_j = 0.2165; X_k = 0.5670
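The decimal coordinates quoted for the D-optimal points follow from the closed-form expressions, and a few lines are enough to confirm them. This is a small numerical check rather than a derivation; the variable names are illustrative, and the third-order edge coordinate is assumed to be b = (1 - 1/sqrt(5))/2, the value consistent with the quoted 0.2764.

```python
import math

# Third-order design: edge point coordinate (assumed closed form)
b_cubic = (1.0 - 1.0 / math.sqrt(5.0)) / 2.0

# Fourth-order design: edge points, Eq. (3.106)
b_quartic_edge = (7.0 - math.sqrt(21.0)) / 14.0

# Fourth-order design: interior points, Eq. (3.107)
b_quartic_inner = (7.0 - math.sqrt(5.0)) / 22.0
third_coordinate = 1.0 - 2.0 * b_quartic_inner

print(f"third-order edge point:    {b_cubic:.4f}, {1.0 - b_cubic:.4f}")
print(f"fourth-order edge point:   {b_quartic_edge:.4f}, {1.0 - b_quartic_edge:.4f}")
print(f"fourth-order inner point:  {b_quartic_inner:.4f}, {third_coordinate:.4f}")
```

The interior coordinate evaluates to about 0.5669, so the 0.5670 in the text is simply 1 minus twice the rounded value 0.2165.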

In Table 3.39 a fourth-order D-optimal design for a ternary system is presented.

Table 3.39 D-optimal design for a ternary system fourth-degree polynomial {3,4}

N    X1       X2       X3    y        N     X1       X2       X3       y
1    1        0        0     y1       9     0.8273   0        0.1727   y1113
2    0        1        0     y2       10    0.1727   0        0.8273   y1333
3    0        0        1     y3       11    0        0.8273   0.1727   y2223
4    0.5      0.5      0     y12      12    0        0.1727   0.8273   y2333
5    0.5      0        0.5   y13      13    0.5670   0.2165   0.2165   y1123
6    0        0.5      0.5   y23      14    0.2165   0.5670   0.2165   y1223
7    0.8273   0.1727   0     y1112    15    0.2165   0.2165   0.5670   y1233
8    0.1727   0.8273   0     y1222

According to this design the coefficients are obtained for a regression equation of the form:

$$
\begin{aligned}
y ={} & \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_{12}X_1X_2 + \beta_{13}X_1X_3 + \beta_{23}X_2X_3 \\
& + \gamma_{12}X_1X_2(X_1 - X_2) + \gamma_{13}X_1X_3(X_1 - X_3) + \gamma_{23}X_2X_3(X_2 - X_3) \\
& + \delta_{12}X_1X_2(X_1 - X_2)^2 + \delta_{13}X_1X_3(X_1 - X_3)^2 + \delta_{23}X_2X_3(X_2 - X_3)^2 \\
& + \beta_{1123}X_1^2X_2X_3 + \beta_{1223}X_1X_2^2X_3 + \beta_{1233}X_1X_2X_3^2
\end{aligned}
\tag{3.108}
$$

Substituting the coordinates of points into the regression equation gives the relationships for calculation of the coefficients of the fourth-degree polynomial:

$$
\beta_i = y_i \tag{3.109}
$$

$$
\beta_{ij} = 4y_{ij} - 2y_i - 2y_j \tag{3.110}
$$

1 

pH 

+ 

(3.111) 

1 - 8yy  + 7(y»y 

(3.112) 

Pgk  = 26  657fi  - 6-167 (yj  + y,)  - 16.96  (y--  + ya)  + 0.51% 

-32.18^-  + yiiit)  + 17. 196  ^y^-  + yikkk ) + 5.72  (y^,  + yjkkk ) 

+84-1%,-  23.237 (y#+y^)  (3.113) 

i%k;  i,  j,  k=l,  2,  3 

Figure 3.20 shows the arrangement of points in D-optimal designs for ternary systems.

Figure 3.20 Arrangement of points in the D-optimal designs of: a) second-order; b) incomplete third-order; c) third-order; d) fourth-order


Example  3.18  [12] 

The variation of viscosity (y) of solutions in the system (NH4)2HPO4-K2CO3-H2O with composition and temperature is studied. The experimental design is conducted within a local region of the concentration triangle bounded by a saturation line at 0 °C (Figs. 3.21 and 3.22). The local region of the diagram was a triangle with the vertices: Z1(42; 0; 58), Z2(0; 30; 70) and Z3(0; 0; 100).

The third-order D-optimal design is prepared relative to pseudocomponents Z1, Z2 and Z3; and the content of initial components at the design points is determined by Eq. (3.84). Table 3.40 presents the experimental conditions both in terms of pseudocomponents and on the natural scale (per cent). The sample variance here is S_y = 0.53, and the number of degrees of freedom is f = 13. From Eq. (3.105) for viscosity at 0 °C the coefficients have been calculated for the third-order regression equation:

$$
\begin{aligned}
y_1 ={} & 8.33Z_1 + 4.99Z_2 + 1.79Z_3 - 6.95Z_1Z_2 - 9.05Z_1Z_3 - 1.37Z_2Z_3 \\
& + 17.90Z_1Z_2(Z_1 - Z_2) + 9.90Z_1Z_3(Z_1 - Z_3) + 12.37Z_2Z_3(Z_2 - Z_3) \\
& + 13.06Z_1Z_2Z_3
\end{aligned}
\tag{3.114}
$$

and at 30 °C:

$$
\begin{aligned}
y_2 ={} & 3.83Z_1 + 2.54Z_2 + 0.80Z_3 - 3.77Z_1Z_2 - 3.10Z_1Z_3 - 0.87Z_2Z_3 \\
& + 5.27Z_1Z_2(Z_1 - Z_2) + 6.55Z_1Z_3(Z_1 - Z_3) + 5.77Z_2Z_3(Z_2 - Z_3) \\
& + 3.00Z_1Z_2Z_3
\end{aligned}
\tag{3.115}
$$


Table 3.40 D-optimal design {3,3}

N      Z1       Z2       Z3       x1     x2      x3      y1     y2
1      1        0        0        42.0   0       58.0    8.33   3.83
2      0        1        0        0      30.0    70.0    4.99   2.54
3      0        0        1        0      0       100.0   1.79   0.80
4      0.2764   0.7236   0        11.6   21.71   66.69   4.22   2.09
5      0.7236   0.2764   0        30.4   8.29    61.31   6.32   2.77
6      0.2764   0        0.7236   11.6   0       88.4    2.20   1.13
7      0.7236   0        0.2764   30.4   0       69.6    4.30   2.26
8      0        0.2764   0.7236   0      8.29    91.71   2.30   1.09
9      0        0.7236   0.2764   0      21.71   78.29   3.93   1.90
10     0.333    0.333    0.333    14.0   10.0    76.0    3.59   1.64
11*    0.22     0.22     0.56     9.1    6.5     84.4    2.00   1.23
12*    0.22     0.56     0.22     9.1    17.0    73.9    3.68   1.82
13*    0.56     0.22     0.22     23.9   6.5     69.6    4.70   2.12

* Control points

The results of testing the adequacy of Eqs. (3.114) and (3.115) are arrayed in Table 3.41.

Table 3.41 Check of lack of fit

N     y1     ŷ1     |Δy1|   y2     ŷ2     |Δy2|   ξ     t_R1   t_R2
11    2.0    1.68   0.32    1.23   0.72   0.51    0.8   0.77   1.22
12    3.68   3.68   0       1.82   1.82   0       0.8   0      0
13    4.70   5.70   1.00    2.12   2.59   0.47    0.9   2.33   1.09

The table value of Student's t-test is t(0.016; 13) = 2.85. For all the test points the values of the t-ratio were found to be less than the table value, hence the regression equations (3.114) and (3.115) adequately fit the experiment with 95% level of confidence. Relations between pseudocomponents and natural variables are as follows:

$$
\begin{aligned}
1 &= 0.42Z_1^{(1)} + 0Z_1^{(2)} + 0.58Z_1^{(3)} \\
0 &= 0Z_1^{(1)} + 0.3Z_1^{(2)} + 0.7Z_1^{(3)} \\
0 &= 0Z_1^{(1)} + 0Z_1^{(2)} + 1Z_1^{(3)} \\
0 &= 0.42Z_2^{(1)} + 0Z_2^{(2)} + 0.58Z_2^{(3)} \\
1 &= 0Z_2^{(1)} + 0.3Z_2^{(2)} + 0.7Z_2^{(3)} \\
0 &= 0Z_2^{(1)} + 0Z_2^{(2)} + 1Z_2^{(3)}
\end{aligned}
\tag{3.116}
$$

The solutions to the sets of Eq. (3.116) are:

$$
\begin{aligned}
Z_1^{(1)} &= 2.38; & Z_2^{(1)} &= 0; \\
Z_1^{(2)} &= 0; & Z_2^{(2)} &= 3.33; \\
Z_1^{(3)} &= 0; & Z_2^{(3)} &= 0
\end{aligned}
\tag{3.117}
$$

Substituting Eq. (3.117) into the set of equations (3.86), we arrive at:

$$
\begin{aligned}
Z_1 &= 2.38(1 - x_2 - x_3) \\
Z_2 &= 3.33x_2 \\
Z_3 &= 1 - Z_1 - Z_2 = 2.38x_3 - 0.95x_2 - 1.38
\end{aligned}
\tag{3.118}
$$

The geometric interpretation of the local factor space and the arrangement of the points are given in Figs. 3.21 and 3.22.

Figure 3.21 Local factor space

Figure 3.22 Arrangement of points


Example  3.19  [12] 

We seek to determine an optimal composition of a multicomponent solvent used to remove hydrocarbons from yeast. The major index of purification here is the hydrocarbon content in the biomass after extraction (y). For technological and economic reasons, the experiment is designed in a local section of the concentration triangle, Fig. 3.23.

In the region covered, the mixture contains, in per cent: acetone, X1 < 74; hexane, X2 < 90; and water, X3 < 10. The local portion of the diagram is a triangle with the vertices: Z1 (9.5; 89.5; 1); Z2 (58.5; 40; 1.5); Z3 (74; 16; 10).


A fourth-order D-optimal design is produced with reference to the pseudocomponents Z1, Z2 and Z3 (Table 3.42). The pseudocomponents satisfy the principal condition for Scheffé's designs. The conversion to initial components at any point within the local simplex studied is carried out from Eq. (3.84). According to this design, an experiment is run with mixtures, each observation being repeated twice. Using Eqs. (3.109)-(3.113), the coefficients of the fourth-order regression equation are calculated in pseudocomponents:


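\[
\begin{aligned}
\hat{y} ={}& 0.1Z_1 + 0.3Z_2 + 0.04Z_3 - 0.48Z_1Z_2 - 0.04Z_1Z_3 - 0.44Z_2Z_3 \\
&+ 0.914Z_1Z_2(Z_1 - Z_2) - 0.312Z_1Z_3(Z_1 - Z_3) - 1.39Z_2Z_3(Z_2 - Z_3) \\
&- 1.003Z_1Z_2(Z_1 - Z_2)^2 + \delta_{13}Z_1Z_3(Z_1 - Z_3)^2 + 0.782Z_2Z_3(Z_2 - Z_3)^2 \\
&+ 1.398Z_1^2Z_2Z_3 + 8.416Z_1Z_2^2Z_3 - 4.703Z_1Z_2Z_3^2
\end{aligned}
\tag{3.119}
\]

(The numerical value of the coefficient \delta_{13} is illegible in the source.)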

Table 3.42 D-optimal design {3,4}

N     Z1      Z2      Z3      X1      X2      X3      y
1     1       0       0       9.5     89.5    1       0.1
2     0       1       0       58.5    40.0    1.5     0.3
3     0       0       1       74.0    16.0    10.0    0.04
4     0.5     0.5     0       34.0    64.7    1.3     0.08
5     0.5     0       0.5     41.7    52.8    5.5     0.06
6     0       0.5     0.5     66.2    28.2    5.8     0.06
7     0.176   0.824   0       49.9    48.7    1.4     0.05
8     0.824   0.176   0       18.12   80.79   1.09    0.09
9     0.176   0       0.824   62.6    29.0    8.4     0.12
10    0.824   0       0.176   20.85   76.55   2.6     0.1
11    0       0.176   0.824   71.22   20.30   8.49    0.2
12    0       0.824   0.176   61.25   35.75   3.0     0.11
13    0.216   0.216   0.568   56.7    37.12   6.18    0.11
14    0.216   0.568   0.216   51.2    45.65   3.15    0.091
15    0.568   0.216   0.216   34.0    62.97   3.03    0.11
16    0.333   0.333   0.333   47.3    48.5    4.2     0.108
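The linear and binary coefficients of a Scheffé polynomial written in the form of Eq. (3.119) can be checked directly against the table. At a vertex the response equals the linear coefficient, and at an edge midpoint the terms containing (Zi - Zj) vanish, so y_ij = (b_i + b_j)/2 + b_ij/4. The short Python sketch below applies this relation to the responses of points 1 to 6; the function and variable names are illustrative and not taken from the source.

```python
# Responses at the vertices (points 1-3) and edge midpoints (points 4-6)
# of Table 3.42, keyed by pseudocomponent index.
y_vertex = {1: 0.10, 2: 0.30, 3: 0.04}
y_mid = {(1, 2): 0.08, (1, 3): 0.06, (2, 3): 0.06}


def binary_coefficient(i, j):
    """b_ij from y_ij = (b_i + b_j)/2 + b_ij/4, with b_i = y_i at the vertices."""
    return 4.0 * y_mid[(i, j)] - 2.0 * y_vertex[i] - 2.0 * y_vertex[j]


for pair in y_mid:
    print(pair, round(binary_coefficient(*pair), 3))
```

The cubic, quartic and ternary coefficients need the remaining design points and Eqs. (3.109)-(3.113), which are not reproduced in this sketch.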

The sets of Eq. (3.87), under the constraints on the component content in the solvent, have the form:

\[
\begin{aligned}
9.5Z_1^{(1)} + 89.5Z_1^{(2)} + 1Z_1^{(3)} &= 1 \\
58.5Z_1^{(1)} + 40Z_1^{(2)} + 1.5Z_1^{(3)} &= 0 \\
74Z_1^{(1)} + 16Z_1^{(2)} + 10Z_1^{(3)} &= 0 \\[4pt]
9.5Z_2^{(1)} + 89.5Z_2^{(2)} + 1Z_2^{(3)} &= 0 \\
58.5Z_2^{(1)} + 40Z_2^{(2)} + 1.5Z_2^{(3)} &= 1 \\
74Z_2^{(1)} + 16Z_2^{(2)} + 10Z_2^{(3)} &= 0
\end{aligned}
\tag{3.120}
\]

The solutions are:

\[
\begin{aligned}
Z_1^{(1)} &= -0.0092; & Z_2^{(1)} &= 0.0215; \\
Z_1^{(2)} &= 0.0116; & Z_2^{(2)} &= -0.00051; \\
Z_1^{(3)} &= 0.0495; & Z_2^{(3)} &= -0.1583.
\end{aligned}
\]

Using  the  solutions  found  we  obtain  the  formulas  relating  natural  coordinates  x 
to  Z: 

\[
\begin{aligned}
Z_1 &= -0.92 + 0.0208x_2 + 0.058x_3 \\
Z_2 &= 2.15 - 0.022x_2 - 0.18x_3 \\
Z_3 &= 1 - Z_1 - Z_2 = -0.23 + 0.001x_2 + 0.121x_3
\end{aligned}
\tag{3.121}
\]
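The conversion between natural coordinates and pseudocomponents amounts to inverting the 3x3 matrix of vertex compositions. The following Python sketch, which assumes NumPy is available, solves both linear systems at once and maps a composition given in per cent to pseudocomponents; the names V, C and to_pseudo are illustrative and not from the source.

```python
import numpy as np

# Vertices of the local simplex in natural coordinates (per cent),
# one row per vertex: (acetone, hexane, water).
V = np.array([
    [9.5, 89.5, 1.0],
    [58.5, 40.0, 1.5],
    [74.0, 16.0, 10.0],
])

# Column j of C holds the coefficients (Z_j^(1), Z_j^(2), Z_j^(3)),
# because V @ C = I states that vertex j has Z_j = 1 and the others 0.
C = np.linalg.solve(V, np.eye(3))


def to_pseudo(x):
    """Map a composition x = (x1, x2, x3) in per cent to (Z1, Z2, Z3)."""
    return np.asarray(x, dtype=float) @ C


print(np.round(C[:, 0], 4))                    # coefficients of Z1
print(np.round(C[:, 1], 5))                    # coefficients of Z2
print(np.round(to_pseudo([47.3, 48.5, 4.2]), 3))  # close to the centroid
```

Solving for all three pseudocomponents at once also yields the Z3 coefficients directly, which gives an independent check on the relation Z3 = 1 - Z1 - Z2.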

The  adequacy  of  the  regression  equation  obtained  was  tested  by  the  Student’s  test 
at  five  test  points,  the  results  being  given  in  Table  3.43. 


Table 3.43 Check of lack of fit

X1      X2      X3      ŷ        ȳ       tR      tT
47.3    48.5    4.2     0.1022   0.108   0.365   2.83
53.0    44.0    3.0     0.079    0.072   0.392   2.83
19.0    79.0    2.0     0.1      0.07    1.778   2.83
37.9    58.7    3.4     0.13     0.12    0.547   2.83
44.5    54.0    1.5     0.04     0.05    0.57    2.83

It is seen that Eq. (3.119) adequately fits the experiment at the significance level α=0.05. The quality of the resultant product is considered satisfactory if the content of residual hydrocarbons in the biomass is under 0.05%. To find the solvent compositions meeting this requirement, the lines of constant response are plotted from Eq. (3.119); the curves are shown in Fig. 3.24. The solvent compositions meeting the requirement y<0.05% lie within the area of the simplex bounded by the corresponding contour line.

3.7

Draper-Lawrence Design

Draper and Lawrence [21] have proposed designs in which, unlike the simplex lattices, all the points are located within the investigation area, i.e. experiments are conducted with the q-component mixtures only. These designs make allowance for the absence of prior information on the response surface and for the wish to approximate an unknown response surface by low-degree polynomials. Data points are chosen to provide the best representation of a complex surface by simple polynomials. In building a polynomial of degree n1, design points are selected so as to minimize the systematic error that arises because the degree n2 of the response function polynomial is higher than the degree n1 of the estimating polynomial. The principles underlying the selection of suitable designs were put forward earlier by Box and Draper. Draper and Lawrence built designs for ternary and quaternary systems and for polynomials of degrees n1=1, n2=2 and n1=2, n2=3. To make the generation of designs more convenient, these authors introduce a new reference system. With ternary mixtures, the new coordinate system is selected in the plane of the concentration triangle (X1, X2, X3), so that the origin coincides with the centroid of the triangle, one of the triangle axes lies on the axis Z2 and the two others are symmetrical about this axis, Fig. 3.25.

Figure 3.25 Coordinate system for the designs of Draper and Lawrence


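The triangular coordinate system (X1, X2, X3) is related to the rectangular one (Z1, Z2) as follows:

\[
\begin{aligned}
Z_1 &= \tfrac{1}{2}(-X_1 + X_2) \\
Z_2 &= \tfrac{1}{2\sqrt{3}}(-X_1 - X_2 + 2X_3)
\end{aligned}
\tag{3.122}
\]

\[
\begin{aligned}
X_1 &= \tfrac{1}{3}\left(-3Z_1 - Z_2\sqrt{3} + m\right) \\
X_2 &= \tfrac{1}{3}\left(3Z_1 - Z_2\sqrt{3} + m\right) \\
X_3 &= \tfrac{1}{3}\left(2Z_2\sqrt{3} + m\right)
\end{aligned}
\tag{3.123}
\]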


where: 

m-is  the  length  of  a side  of  the  concentration  triangle. 

When  the  whole  of  the  diagram  is  explored,  m=l;  and  when  local  areas  of  the  dia- 
gram are  explored,  m<l.  Design  points  for  ternary  system  are  selected  (in  the  coor- 
dinates Z1;  Z2)  from  the  following  sets: 


1) Vertices of a triangle similar to a given concentration triangle and centered on the origin, with side of length p:

\[ \left(0,\ \tfrac{1}{\sqrt{3}}p\right);\quad \left(\tfrac{1}{2}p,\ -\tfrac{\sqrt{3}}{6}p\right);\quad \left(-\tfrac{1}{2}p,\ -\tfrac{\sqrt{3}}{6}p\right) \]

2) Vertices of a triangle similar to a given concentration triangle and centered on the origin, with side of length q:

\[ \left(0,\ -\tfrac{1}{\sqrt{3}}q\right);\quad \left(\tfrac{1}{2}q,\ \tfrac{\sqrt{3}}{6}q\right);\quad \left(-\tfrac{1}{2}q,\ \tfrac{\sqrt{3}}{6}q\right) \]

3) Vertices of a square centered on the origin, with sides 2a parallel to the axes: (±a, ±a);

4) Points on the coordinate axes: (±b, 0), (0, ±b);

5) Vertices of a rectangle: (c, d), (-c, -d), (c, -d) and (-c, d).


After one or the other design of Draper-Lawrence has been constructed for a ternary system, first-degree polynomials are derived in the two independent variables Z1 and Z2 (n1=1 for n2=2):

\[ y = b_0 + b_1Z_1 + b_2Z_2 \tag{3.124} \]

or second-order polynomials (n1=2 for n2=3):

\[ y = b_0 + b_1Z_1 + b_2Z_2 + b_{12}Z_1Z_2 + b_{11}Z_1^2 + b_{22}Z_2^2 \tag{3.125} \]


Draper and Lawrence suggested that, for first-order polynomials and ternary systems (q=3), the designs of experiments should contain from 6 to 9 design points. Parameters for some of the Draper-Lawrence designs (in fractions of m) at q=3, n1=1 and n2=2 are given in Table 3.44. If the number of design points is greater than that of the selected sets, an appropriate number of points is added at the center of the triangle (with coordinates Z1=0, Z2=0). As an example, we consider a Draper-Lawrence design (1,2) containing six points, Table 3.45. Points of set 1 at m=1 have the coordinates (Z1; Z2):


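\[ \left(0;\ \tfrac{0.621}{\sqrt{3}}\right);\quad \left(\tfrac{0.621}{2};\ -\tfrac{0.621\sqrt{3}}{6}\right);\quad \left(-\tfrac{0.621}{2};\ -\tfrac{0.621\sqrt{3}}{6}\right) \]

or:

(0.0; 0.366); (0.3105; -0.18); (-0.3105; -0.18)

Points of set 2 have the coordinates:

\[ \left(0;\ -\tfrac{0.339}{\sqrt{3}}\right);\quad \left(\tfrac{0.339}{2};\ \tfrac{0.339\sqrt{3}}{6}\right);\quad \left(-\tfrac{0.339}{2};\ \tfrac{0.339\sqrt{3}}{6}\right) \]

or:

(0.0; -0.196); (0.170; 0.098); (-0.170; 0.098)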


Table 3.44 Parameters of Draper-Lawrence designs for q=3, n1 = 1, n2 = 2

Set        Center points   Design points total, N   Parameters
(1,2)      0               6                        p=0.621; q=0.339
(1,2)      1               7                        p=0.662; q=0.381
(1,2)      2               8                        p=0.699; q=0.421
(1,2)      3               9                        p=0.733; q=0.457
(1,3)      0               7                        p=0.616; a=0.160
(1,4)      0               7                        p=0.616; b=0.226
(1,5)      0               7                        p=0.616; c=√(0.051m² - d²)*
(1,1,2)    0               9                        p1=0.606; p2=0.500; q=0.364
(1,2,2)    0               9                        p=0.727; q1=0.425; q2=0.200

* The value of d² is chosen arbitrarily


Table 3.45 Draper-Lawrence design matrix (1,2) for q=3, n1 = 1, n2 = 2

N    Z1       Z2       X1      X2      X3
1    0.0      0.366    0.12    0.12    0.76
2    0.311    -0.18    0.127   0.748   0.125
3    -0.311   -0.18    0.748   0.127   0.125
4    0.0      -0.196   0.447   0.447   0.106
5    0.170    0.098    0.106   0.447   0.447
6    -0.170   0.098    0.447   0.106   0.447

We make a transition from point coordinates in the system (Z1, Z2) to those in the triangle (X1, X2, X3) using Eqs. (3.123). Let us map, for example, the first point with coordinates Z1=0.0 and Z2=0.366 (m=1). For this point:

\[
\begin{aligned}
X_1 &= \tfrac{1}{3}\left(-0.366\sqrt{3} + 1\right) = 0.12 \\
X_2 &= \tfrac{1}{3}\left(-0.366\sqrt{3} + 1\right) = 0.12 \\
X_3 &= \tfrac{1}{3}\left(2 \times 0.366\sqrt{3} + 1\right) = 0.76
\end{aligned}
\]

To test the calculation we sum up:

X1 + X2 + X3 = 0.12 + 0.12 + 0.76 = 1.0

The arrangement of the points in the concentration triangle is shown in Fig. 3.26. For the second-order polynomials of Eq. (3.125) as applied to ternary systems, Draper and Lawrence suggested designs containing from 8 to 15 design points. Parameters for the Draper-Lawrence designs (in fractions of m) at q=3, n1=2 and n2=3 are summarized in Table 3.46.


Figure 3.26 Draper-Lawrence design (1,2)

Table 3.46 Parameters of Draper-Lawrence designs at q=3, n1 = 2, n2 = 3

Set          n0   N    Parameters
(1,2)        1    7    p=0.670; q=0.385
(1,2)        2    8    p=0.698; q=0.421
(1,2)        3    9    p=0.723; q=0.450
(1,1,2)      0    9    p1=0.715; p2=0.233; q=0.430
(1,1,2)      1    10   p1=0.729; p2=0.323; q=0.445
(1,1,2)      2    11   p1=0.738; p2=0.398; q=0.462
(1,1,2)      3    12   p1=0.743; p2=0.465; q=0.450
(1,1,2)      4    13   p1=0.742; p2=0.532; q=0.485
(1,2,2)      0    9    p=0.716; q1=0.342; q2=0.342
(1,2,2)      1    10   p=0.739; q1=0.367; q2=0.367
(1,1,1,2)    0    12   p1=0.751; p2=0.422; p3=0.189; q=0.470
(1,1,2,2)    0    12   p1=0.748; p2=0.445; q1=0.468; q2=0.156
(1,2,2,2)    0    12   p=0.782; q1=0.348; q2=0.348; q3=0.348
(1,3,4)      2    13   p=0.756; a=0.183; b=0.258
(1,3,5)      2    13   p=0.756; a=0.300; c=0.547; d=0.130
(1,4,5)      2    13   p=0.756; b=0.212; c=0.130; d=0.257
(1,5,5)      2    13   p=0.756; c1=0.094; d1=0.272; c2=0.172; d2=0.125
(1,1,2,5)    0    13   p1=0.297; p2=0.756; q=0.295; c=0.111; d=0.268
(1,1,2,5)    0    13   p1=0.478; p2=0.756; q=0.477; c=0.045; d=0.109
(1,1,2,5)    1    14   p1=0.369; p2=0.766; q=0.319; c=0.112; d=0.270
(1,1,2,5)    1    14   p1=0.514; p2=0.762; q=0.481; c=0.058; d=0.140
(1,1,2,5)    2    15   p1=0.545; p2=0.766; q=0.480; c=0.071; d=0.171

n0 is the number of center points


Consider  then  a design  (1,3,4)  containing  13  points:  11  points  of  the  sets  (1,3,4) 
and  2 augmenting  points  at  the  center  of  the  triangle,  Fig.  3.27. 


Figure 3.27 Draper-Lawrence design (1,3,4)

The points of set 1 at m=1 have the coordinates (Z1, Z2):

\[ \left(0;\ \tfrac{0.756}{\sqrt{3}}\right);\quad \left(\tfrac{0.756}{2};\ -\tfrac{0.756\sqrt{3}}{6}\right);\quad \left(-\tfrac{0.756}{2};\ -\tfrac{0.756\sqrt{3}}{6}\right) \]

or:

(0.0; 0.437); (0.378; -0.218); (-0.378; -0.218)

The  points  of  set  3: 

(0.183;  0.183);  (0.183;  -0.183);  (-0.183;  0.183);  (-0.183;  -0.183) 

The  points  of  set  4: 

(0.258;  0);  (-0.258;  0);  (0;  0.258);  (0.0;  -0.258) 

The  experimental  design  is  tabulated  in  Table  3.47. 
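The construction of the design (1,3,4) can be sketched in a few lines of Python. The code below generates the three point sets in the rectangular coordinates (Z1, Z2), appends the two center points and maps every point to the concentration triangle with Eq. (3.123). All function names are illustrative assumptions; only the formulas and the parameters p=0.756, a=0.183, b=0.258 come from the text.

```python
import math

SQRT3 = math.sqrt(3.0)


def triangle(side, inverted=False):
    """Sets 1 and 2: vertices of an equilateral triangle centered on the origin."""
    s = -1.0 if inverted else 1.0
    return [
        (0.0, s * side / SQRT3),
        (side / 2.0, -s * side * SQRT3 / 6.0),
        (-side / 2.0, -s * side * SQRT3 / 6.0),
    ]


def square(a):
    """Set 3: vertices of a square with sides 2a parallel to the axes."""
    return [(a, a), (a, -a), (-a, a), (-a, -a)]


def axial(b):
    """Set 4: points on the coordinate axes."""
    return [(b, 0.0), (-b, 0.0), (0.0, b), (0.0, -b)]


def to_mixture(z1, z2, m=1.0):
    """Eq. (3.123): rectangular (Z1, Z2) to triangular (X1, X2, X3)."""
    x1 = (-3.0 * z1 - z2 * SQRT3 + m) / 3.0
    x2 = (3.0 * z1 - z2 * SQRT3 + m) / 3.0
    x3 = (2.0 * z2 * SQRT3 + m) / 3.0
    return x1, x2, x3


def to_rect(x1, x2, x3):
    """Eq. (3.122): triangular (X1, X2, X3) to rectangular (Z1, Z2)."""
    return (-x1 + x2) / 2.0, (-x1 - x2 + 2.0 * x3) / (2.0 * SQRT3)


# Design (1,3,4): 11 points of the sets plus 2 center points.
design = triangle(0.756) + square(0.183) + axial(0.258) + [(0.0, 0.0)] * 2

for n, (z1, z2) in enumerate(design, start=1):
    x = to_mixture(z1, z2)
    print(n, round(z1, 3), round(z2, 3), [round(v, 3) for v in x])
```

Every mapped point sums to m, which is the same check the text applies by hand to a single point.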

Table 3.47 Design matrix (1,3,4); q=3; n1 = 2, n2 = 3

N     Z1       Z2       X1      X2      X3
1     0.0      0.437    0.081   0.081   0.838
2     0.378    -0.218   0.081   0.837   0.082
3     -0.378   -0.218   0.837   0.081   0.082
4     0.183    0.183    0.044   0.410   0.546
5     0.183    -0.183   0.256   0.622   0.122
6     -0.183   0.183    0.410   0.045   0.545
7     -0.183   -0.183   0.622   0.256   0.122
8     0.258    0.0      0.076   0.591   0.333
9     -0.258   0.0      0.592   0.075   0.333
10    0.0      0.258    0.184   0.184   0.632
11    0.0      -0.258   0.482   0.482   0.036
12    0.0      0.0      0.333   0.333   0.333
13    0.0      0.0      0.333   0.333   0.333

The coordinates X1, X2, X3 are related to Z1, Z2 by Eqs. (3.123). The coefficients of the second-order regression equation y=f(Z1, Z2) are derived by the method of least squares. To check the adequacy of fit, t-tests are applied to the experimental data at test points. The equation is adequate if the tR values for all the test points are less than the table value. The tR values are found from Eq. (3.59). Values of ξ can be taken from appropriate contour charts. With the designs of Draper-Lawrence, the composition dependence of ξ can only be calculated with a digital computer.

To develop designs for quaternary systems, Draper and Lawrence also introduced a new coordinate system (Z1, Z2, Z3). The origin of the new system coincides with the centroid of the concentration tetrahedron (X1, X2, X3, X4), and the coordinate axes are directed so that the four vertices of the tetrahedron in the new coordinate system form a half-replica of the full factorial design 2^3 with the defining contrast 1=-Z1Z2Z3. In the new system (Z1, Z2, Z3) the coordinates of the tetrahedron vertices are in the general case:

(m, m, -m); (m, -m, m); (-m, m, m); (-m, -m, -m)

and for the tetrahedron edge m=1:

(1, 1, -1); (1, -1, 1); (-1, 1, 1); (-1, -1, -1)

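The coordinate systems (X1, X2, X3, X4) and (Z1, Z2, Z3) are related to each other as follows:

\[
\begin{aligned}
Z_1 &= X_1 + X_2 - X_3 - X_4 \\
Z_2 &= X_1 - X_2 + X_3 - X_4 \\
Z_3 &= -X_1 + X_2 + X_3 - X_4
\end{aligned}
\tag{3.126}
\]

and

\[
\begin{aligned}
X_1 &= \tfrac{1}{4}(Z_1 + Z_2 - Z_3 + m) \\
X_2 &= \tfrac{1}{4}(Z_1 - Z_2 + Z_3 + m) \\
X_3 &= \tfrac{1}{4}(-Z_1 + Z_2 + Z_3 + m) \\
X_4 &= \tfrac{1}{4}(-Z_1 - Z_2 - Z_3 + m)
\end{aligned}
\tag{3.127}
\]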
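The two transformations for quaternary mixtures are mutually inverse whenever X1+X2+X3+X4=m. A minimal Python sketch of both directions follows; the function names to_rect4 and to_mixture4 are illustrative and not from the source.

```python
def to_rect4(x1, x2, x3, x4):
    """Eq. (3.126): tetrahedral (X1..X4) to rectangular (Z1, Z2, Z3)."""
    return (
        x1 + x2 - x3 - x4,
        x1 - x2 + x3 - x4,
        -x1 + x2 + x3 - x4,
    )


def to_mixture4(z1, z2, z3, m=1.0):
    """Eq. (3.127): rectangular (Z1, Z2, Z3) to tetrahedral (X1..X4)."""
    return (
        (z1 + z2 - z3 + m) / 4.0,
        (z1 - z2 + z3 + m) / 4.0,
        (-z1 + z2 + z3 + m) / 4.0,
        (-z1 - z2 - z3 + m) / 4.0,
    )


# The pure components map to the vertices of the half-replica,
# and the centroid maps to the origin.
for vertex in [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]:
    print(vertex, to_rect4(*vertex))
print(to_rect4(0.25, 0.25, 0.25, 0.25))
```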

Design points for quaternary systems are chosen (in coordinates Z1, Z2, Z3) from the following sets:

1) Vertices of a tetrahedron similar to the concentration one, with the coordinates of the vertices:

(a, a, -a); (a, -a, a); (-a, a, a); (-a, -a, -a)

2) Vertices of a tetrahedron:

(b, b, b); (b, -b, b); (b, b, -b); (-b, -b, b)

3) Points on the axes:

(±h, 0, 0); (0, ±h, 0); (0, 0, ±h)

4) Vertices of tetrahedrons with the coordinates:

(-r, -s, -t); (-r, s, t); (r, -s, t); (r, s, -t)

(-t, r, s); (t, -r, s); (t, r, -s); (-t, -r, -s)

(-s, t, r); (s, -t, r); (s, t, -r); (-s, -t, -r)

After one or other Draper-Lawrence design has been constructed for quaternary systems, polynomials are obtained in the three independent variables Z1, Z2 and Z3:

a) First-degree polynomial (n1=1, for n2=2)

\[ y = b_0 + b_1Z_1 + b_2Z_2 + b_3Z_3 \tag{3.128} \]

b) Second-order polynomial (n1=2, for n2=3)

\[
\begin{aligned}
y ={}& b_0 + b_1Z_1 + b_2Z_2 + b_3Z_3 + b_{12}Z_1Z_2 + b_{13}Z_1Z_3 + b_{23}Z_2Z_3 \\
&+ b_{11}Z_1^2 + b_{22}Z_2^2 + b_{33}Z_3^2
\end{aligned}
\tag{3.129}
\]

Parameters (in fractions of m) for some designs of Draper-Lawrence containing no more than 12 points, at q=4, n1=1, n2=2, are provided in Table 3.48.

Table 3.48 Parameters of Draper-Lawrence design for q=4, n1 = 1, n2 = 2

Set      Center points   Design points total, N   Parameters
(1,2)    0               8                        a=0.548; b=0.315
(1,2)    1               9                        a=0.567; b=0.344
(1,2)    2               10                       a=0.602; b=0.371
(1,2)    3               11                       a=0.626; b=0.397
(1,2)    4               12                       a=0.650; b=0.421
(1,3)    0               10                       a=0.550; h=0.628
(1,3)    1               11                       a=0.568; h=0.674
(1,3)    2               12                       a=0.585; h=0.718
(4)      0               12                       r=0.539; s=0.248; t=0.500
(4)      0               12                       r=0.616; s=0.360; t=0.300

At N>12, the values of parameters r, s, t and N are to be found from the set of equations:

\[
\begin{aligned}
r^2 + s^2 + t^2 &= N \times m^2/20 \\
r \times s \times t &= N \times m^3/180
\end{aligned}
\tag{3.130}
\]

where:

m is the edge of the concentration tetrahedron.
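Two equations in three unknowns leave one parameter free, so one of r, s, t can be fixed and the other two computed. The Python sketch below assumes the reading r²+s²+t² = N·m²/20 and r·s·t = N·m³/180 of the system above, fixes t and solves for r and s through their sum and difference; the function name solve_rs and its argument defaults are illustrative, not from the source.

```python
import math


def solve_rs(t, N=12, m=1.0):
    """Given t, return (r, s) with r >= s satisfying
    r^2 + s^2 + t^2 = N*m^2/20 and r*s*t = N*m^3/180 (assumed reading)."""
    sum_sq = N * m**2 / 20.0 - t**2      # r^2 + s^2
    prod = N * m**3 / (180.0 * t)        # r * s
    disc = sum_sq - 2.0 * prod           # (r - s)^2
    if disc < 0.0:
        raise ValueError("no real solution for this t")
    total = math.sqrt(sum_sq + 2.0 * prod)   # r + s
    diff = math.sqrt(disc)                   # r - s
    return (total + diff) / 2.0, (total - diff) / 2.0


for t in (0.500, 0.300):
    r, s = solve_rs(t)
    print(t, round(r, 3), round(s, 3))
```

With N=12 and m=1 the two choices of t give values of r and s close to the parameters tabulated for the set (4) designs, which supports the assumed exponents of m.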

The design parameters (in fractions of m) for quaternary mixtures with n1=2, n2=3 and 18≤N≤24 are presented in Table 3.49.

All the above designs are built on the assumption that only a systematic bias exists. In actual practice, however, the experimental data contain random errors as well as the systematic error.

When minimizing the total error [21], the basic configuration of the designs [20] may be retained with the coordinates of the design points multiplied by a quantity θ>1, i.e. for a ternary system the points (θZ1, θZ2), and for a quaternary one the points (θZ1, θZ2, θZ3), should be used. The parameter θ depends on the random error and the polynomial coefficients, being close to unity if the random error is predominant. As the exact value of θ is rather difficult to find for each particular experiment, it may, to a rather rough approximation, be taken equal to 1.1 for ternary and 1.2 for quaternary systems. As an example, to minimize the total error we transform the design (1,3,4) for q=3, n1=2 and n2=3, given in Table 3.47.

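The coordinates of the first point for θ=1.1 are to be found as follows:

\[
\begin{aligned}
X_1 &= \tfrac{1}{3}\left(-3\theta Z_1 - \theta Z_2\sqrt{3} + m\right) = \tfrac{1}{3}\left(-1.1 \times 0.437\sqrt{3} + 1\right) = 0.056 \\
X_2 &= \tfrac{1}{3}\left(3\theta Z_1 - \theta Z_2\sqrt{3} + m\right) = \tfrac{1}{3}\left(3 \times 1.1 \times 0.0 - 1.1 \times 0.437\sqrt{3} + 1\right) = 0.056 \\
X_3 &= \tfrac{1}{3}\left(2\theta Z_2\sqrt{3} + m\right) = \tfrac{1}{3}\left(2 \times 1.1 \times 0.437\sqrt{3} + 1\right) = 0.888
\end{aligned}
\]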

The  complete  design  of  the  experiment  is  given  in  Table  3.50. 
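The stretching of a design by θ is a one-line change to the coordinate mapping. The sketch below, in Python, multiplies the rectangular coordinates by θ before converting them to mixture proportions; the function name stretched_mixture is an illustrative assumption.

```python
import math

SQRT3 = math.sqrt(3.0)


def stretched_mixture(z1, z2, theta=1.1, m=1.0):
    """Scale (Z1, Z2) by theta, then map to (X1, X2, X3) for a ternary system."""
    z1, z2 = theta * z1, theta * z2
    x1 = (-3.0 * z1 - z2 * SQRT3 + m) / 3.0
    x2 = (3.0 * z1 - z2 * SQRT3 + m) / 3.0
    x3 = (2.0 * z2 * SQRT3 + m) / 3.0
    return x1, x2, x3


# First point of design (1,3,4): Z1 = 0.0, Z2 = 0.437
print([round(v, 3) for v in stretched_mixture(0.0, 0.437)])
```

Because the scaling pushes points outward, it is worth checking that no proportion becomes negative: the point (0, -0.258) of set 4, for instance, lands almost on the edge X3=0 when θ=1.1.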


Table 3.49 Parameters of Draper-Lawrence design for q=4, n1 = 2, n2 = 3

Set            Center points   N     Parameters

                                     a1      a2      r       s       t
(1,1,4)        0               20    0.673   0.0945  0.684   0.260   0.0524
(1,1,4)        1               21    0.679   0.179   0.694   0.270   0.0564
(1,1,4)        2               22    0.685   0.248   0.702   0.274   0.0532
(1,1,4)        3               23    0.690   0.315   0.708   0.268   0.0406
(1,1,4)        4               24    0.694   0.393   0.710   0.242   0.00912

                                     a       b       r       s       t
(1,2,4)        1               21    0.676   0.165   0.696   0.274   0.0784
(1,2,4)        2               22    0.680   0.220   0.706   0.281   0.106
(1,2,4)        3               23    0.683   0.272   0.717   0.274   0.144
(1,2,4)        4               24    0.685   0.317   0.727   0.226   0.225

                                     a       h       r       s       t
(1,3,4)        0               22    0.682   0.319   0.0807  0.291   0.702
(1,3,4)        1               23    0.686   0.390   0.0925  0.306   0.708
(1,3,4)        2               24    0.690   0.459   0.104   0.321   0.710

                                     a1      a2      b       h
(1,1,2,3)      0               18    0.292   0.667   0.279   0.765
(1,1,2,3)      1               19    0.337   0.672   0.292   0.776
(1,1,2,3)      2               20    0.380   0.674   0.305   0.786
(1,1,2,3)      3               21    0.420   0.676   0.318   0.795
(1,1,2,3)      4               22    0.460   0.674   0.332   0.805
(1,1,2,3)      5               23    0.501   0.669   0.346   0.814
(1,1,2,3)      6               24    0.548   0.656   0.359   0.822

                                     a1      a2      a3      b       h
(1,1,1,2,3)    0               22    0.679   0.442   0.132   0.326   0.805
(1,1,1,2,3)    1               23    0.683   0.455   0.191   0.332   0.814
(1,1,1,2,3)    2               24    0.691   0.441   0.288   0.340   0.822

                                     a1      a2      b1      b2      h
(1,1,2,2,3)    0               22    0.677   0.451   0.126   0.321   0.805
(1,1,2,2,3)    1               23    0.677   0.479   0.181   0.315   0.814
(1,1,2,2,3)    2               24    0.672   0.517   0.275   0.275   0.822

                                     a1      a2      b       h1      h2
(1,1,2,3,3)    0               24    0.680   0.494   0.329   0.317   0.818

Table 3.50 Design matrix {1,3,4} for q=3, n1=2, n2=3, θ=1.1

N     X1      X2      X3
1     0.056   0.056   0.888
2     0.056   0.888   0.056
3     0.888   0.056   0.056
4     0.016   0.418   0.566
5     0.248   0.651   0.101
6     0.418   0.016   0.566
7     0.651   0.248   0.101
8     0.050   0.617   0.333
9     0.617   0.050   0.333
10    0.170   0.170   0.660
11    0.500   0.500   0.000
12    0.333   0.333   0.333
13    0.333   0.333   0.333

Example  3.20  [12] 

The dependence of viscosity at 30 °C on composition is studied for the liquid complex fertilizer consisting of diammonium phosphate, potash and water: (NH4)2HPO4, K2CO3, H2O. For the investigation, the region of unsaturated solutions for both salts at 30 °C is selected (Fig. 3.28), the side of the concentration triangle being m=0.5.


Figure 3.28 Local factor space

A Draper-Lawrence design containing 13 points is performed (Table 3.51). It is convenient to treat the subregion studied as a concentration triangle in the new coordinate system (X1', X2', X3'):

X1'+X2'+X3=0.5 

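The coordinates Xi' and Xi are related by:

\[
\begin{aligned}
X_1' &= 2X_1 \\
X_2' &= 2X_2 \\
X_3' &= 1 - X_1' - X_2' = 1 - 2X_1 - 2X_2
\end{aligned}
\tag{3.131}
\]

we also have, by Eq. (3.123):

\[
\begin{aligned}
Z_1 &= \tfrac{1}{2}\left(-X_1' + X_2'\right) = -X_1 + X_2 \\
Z_2 &= \tfrac{1}{2\sqrt{3}}\left(-X_1' - X_2' + 2X_3'\right) = \tfrac{1}{\sqrt{3}}\left(X_3' - X_1 - X_2\right)
\end{aligned}
\tag{3.132}
\]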
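The chain from the rectangular design coordinates to natural concentrations for a local triangle can be sketched as follows in Python. The code maps (Z1, Z2) to the local coordinates X' with m=1 and then to natural fractions by X1=X1'/2, X2=X2'/2, X3=1-X1-X2, which is the relation stated above for the region with side 0.5. The function names are illustrative assumptions.

```python
import math

SQRT3 = math.sqrt(3.0)


def local_from_rect(z1, z2):
    """Local triangle coordinates (X1', X2', X3') from (Z1, Z2), with m = 1."""
    return (
        (-3.0 * z1 - z2 * SQRT3 + 1.0) / 3.0,
        (3.0 * z1 - z2 * SQRT3 + 1.0) / 3.0,
        (2.0 * z2 * SQRT3 + 1.0) / 3.0,
    )


def natural_from_local(x1p, x2p):
    """Natural fractions for the local region: X1' = 2*X1, X2' = 2*X2."""
    x1, x2 = x1p / 2.0, x2p / 2.0
    return x1, x2, 1.0 - x1 - x2


def rect_from_natural(x1, x2):
    """Rectangular coordinates written directly in natural fractions."""
    x3p = 1.0 - 2.0 * x1 - 2.0 * x2
    return -x1 + x2, (x3p - x1 - x2) / SQRT3


x1p, x2p, x3p = local_from_rect(0.378, -0.218)
print([round(v, 3) for v in natural_from_local(x1p, x2p)])
```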

The design of the experiment and measurements of viscosity for two parallel experiments are given in Table 3.51. From the table, the coefficients of the regression equation are obtained by the method of least squares:

\[ \hat{y} = 1.54 - 0.94Z_1 - 1.01Z_2 - 8.93Z_1Z_2 + 10.48Z_1^2 + 0.76Z_2^2 \]


Table 3.51 Design matrix and experimental data

N     Z1       Z2       X1'     X2'     X3'     X1      X2      X3      y
1     0        0.437    0.081   0.081   0.838   0.040   0.040   0.920   1.033
2     0.378    -0.218   0.081   0.837   0.082   0.040   0.418   0.542   4.873
3     -0.378   -0.218   0.837   0.081   0.082   0.418   0.040   0.542   4.722
4     0.183    0.183    0.044   0.410   0.546   0.022   0.205   0.772   1.481
5     0.183    -0.183   0.256   0.622   0.122   0.128   0.311   0.561   3.294
6     -0.183   0.183    0.410   0.045   0.545   0.311   0.128   0.561   2.996
7     -0.183   -0.183   0.622   0.256   0.122   0.205   0.023   0.772   2.160
8     0.258    0        0.075   0.592   0.333   0.092   0.092   0.816   1.430
9     -0.258   0        0.592   0.075   0.333   0.241   0.241   0.518   3.624
10    0        0.258    0.184   0.184   0.632   0.038   0.296   0.666   2.423
11    0        -0.258   0.482   0.482   0.036   0.296   0.038   0.666   2.165
12    0        0        0.333   0.333   0.333   0.167   0.167   0.666   2.191
13    0        0        0.333   0.333   0.333   0.167   0.167   0.666   2.207

The equation derived is found to be an adequate fit to the experiment. By Eq. (3.132), the estimated regression equation in the natural scale takes the form:

\[
\begin{aligned}
\hat{y} ={}& 1.54 + 2.1X_1 + 0.22X_2 - 0.58X_3 + 1.18X_1^2 + 21.81X_2^2 \\
&- 18.93X_1X_2 + 4.14X_1X_3 - 6.17X_2X_3 + 0.25X_3^2
\end{aligned}
\]

Geometric interpretation of design points in the Draper-Lawrence design is shown in Fig. 3.29.




3.8 

Factorial  Experiments  with  Mixture 


In  certain  practical  applications  it  is  advantageous  to  consider  variation  of  properties 
not  with  absolute  amounts  of  components,  but  with  their  ratios.  If  the  percentage  of 
each  component  is  not  zero,  then  given  upper  and  lower  constraints  for  the  compo- 
nents, ratios  of  components  may  be  utilized  to  build  conventional  factorial  designs 
[22].  The  number  of  ratios  in  a q-component  system  is  q-1: 


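\[ X_1 + X_2 + \dots + X_q = 1 \]

\[ Z_1 = \frac{X_1}{X_2};\quad Z_2 = \frac{X_2}{X_3};\quad \dots;\quad Z_i = \frac{X_i}{X_{i+1}};\quad \dots;\quad Z_{q-1} = \frac{X_{q-1}}{X_q} \tag{3.133} \]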


Thus, using the component ratios as independent factors, the dimensionality of the problem is reduced by one, and hence the number of experiments is also decreased.

Figure 3.30 shows Kenworthy designs [22] 2^2 and 2^3 for handling the variation of property with component ratios Z1=X1/X3 and Z2=X3/X2.

The points on the line originating from vertex X2 feature a constant ratio of components X1 and X3. In a like manner, the line originating from vertex X1 is the locus of equal ratios of X3 to X2. To meet the orthogonality condition for the design matrix, recourse is made to the linear transformation of Eq. (2.59).

Figure 3.30 Designs using component ratios


Example  3.21 

We seek the functional relationship between the yield of sodium and potassium bicarbonates and the composition of the initial sylvinite solution. The factors controlling the potassium utilization in the carbonization process are chosen to be the per cent ratios of two of the three components making up the system:

Z1=NaCl/KCl; Z2=H2O/NaCl

To derive the regression equation, we shall use a second-order orthogonal design for k=2, N=9 and the star arm α=1, Fig. 3.31.

The region where independent factors will be studied is given in Table 3.52, and the design matrix in Table 3.53.

Table 3.52 Factor variation levels

Name                 Z1      Z2
Null level           3.315   5.53
Variation interval   1.935   1.53
Upper level          5.25    7.06
Lower level          1.38    4.00
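Since the design is written in ratios, each design point has to be converted back to a composition before the solution can be prepared. A minimal Python sketch is given below, assuming the three components sum to 100% and using the ratios Z1=NaCl/KCl and Z2=H2O/NaCl together with the null levels and variation intervals of the two factors. The function names and the dictionary layout are illustrative, not from the source.

```python
LEVELS = {
    "Z1": {"center": 3.315, "step": 1.935},   # NaCl/KCl
    "Z2": {"center": 5.53, "step": 1.53},     # H2O/NaCl
}


def decode(coded, factor):
    """Coded level (-1, 0, +1) to the natural value of the ratio."""
    lv = LEVELS[factor]
    return lv["center"] + coded * lv["step"]


def composition(z1, z2, total=100.0):
    """(NaCl, KCl, H2O) in per cent from z1 = NaCl/KCl and z2 = H2O/NaCl."""
    kcl = total / (1.0 + z1 + z1 * z2)
    nacl = z1 * kcl
    h2o = z2 * nacl
    return nacl, kcl, h2o


for x1 in (-1, 0, 1):
    for x2 in (-1, 0, 1):
        z1, z2 = decode(x1, "Z1"), decode(x2, "Z2")
        print(x1, x2, [round(v, 2) for v in composition(z1, z2)])
```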

Table 3.53 Design matrix

N    X0   X1   X2   X1X2   X1'    X2'    y'     y''    ȳ      S²y    ŷ       |ȳ-ŷ|   (ȳ-ŷ)²
1    +    +    +    +      +1/3   +1/3   10     14     12     8      10.74   1.26    1.59
2    +    -    -    +      +1/3   +1/3   77.5   79.5   78.5   2      77.74   0.76    0.58
3    +    +    -    -      +1/3   +1/3   27     28.6   27.8   1.28   27.84   0.04    0.001
4    +    -    +    -      +1/3   +1/3   59.5   62.5   61     4.5    60.64   0.36    0.13
5    +    +    0    0      +1/3   -2/3   18.5   17.5   18     0.5    19.3    1.3     1.69
6    +    -    0    0      +1/3   -2/3   68.2   67.8   68     0.08   69.19   1.19    1.42
7    +    0    +    0      -2/3   +1/3   30     34     32     8      32.59   0.59    0.35
8    +    0    -    0      -2/3   +1/3   51     49     50     2      49.69   0.31    0.09
9    +    0    0    0      -2/3   -2/3   43     40.2   41.6   3.92   41.14   0.46    0.21

The variances at the replicated design points are homogeneous, so that the reproducibility variance is:

\[ S_y^2 = \frac{\sum_{i=1}^{9} S_i^2}{9} = \frac{30.28}{9} = 3.37; \quad f = N(n - 1) = 9(2 - 1) = 9 \]

The coefficients of the regression equation and their errors are computed from Eqs. (2.102) and (2.107).

\[ b_0 = 43.21;\ b_1 = -24.95;\ b_2 = -8.55;\ b_{12} = 0.425;\ b_{11} = 3.1;\ b_{22} = 1.1 \]

\[ S_{b_j} = 0.75;\ S_{b_{jj}} = 1.3;\ S_{b_{uj}} = 0.92 \]

All regression coefficients except b12 and b22 are statistically significant at the 95% confidence level, and the regression equation takes the form:

\[ \hat{y} = 41.14 - 24.95X_1 - 8.55X_2 + 3.1X_1^2 \]
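Because the design matrix is orthogonal, each coefficient can be computed independently as the ratio of a column-response cross product to the column sum of squares. The Python sketch below, assuming NumPy is available, recomputes the coefficients and the reproducibility variance from the two replicate series of the design matrix. Variable names are illustrative; the transformed quadratic columns use the shift 2/3 shown in the table.

```python
import numpy as np

# Coded levels of the nine design points and the two replicate responses.
x1 = np.array([1, -1, 1, -1, 1, -1, 0, 0, 0], dtype=float)
x2 = np.array([1, -1, -1, 1, 0, 0, 1, -1, 0], dtype=float)
y_a = np.array([10, 77.5, 27, 59.5, 18.5, 68.2, 30, 51, 43])
y_b = np.array([14, 79.5, 28.6, 62.5, 17.5, 67.8, 34, 49, 40.2])

y_mean = (y_a + y_b) / 2.0

# Columns: X0, X1, X2, X1X2, X1' = X1^2 - 2/3, X2' = X2^2 - 2/3
F = np.column_stack([np.ones(9), x1, x2, x1 * x2,
                     x1**2 - 2.0 / 3.0, x2**2 - 2.0 / 3.0])

# Orthogonal columns: b_j = sum(x_ij * y_i) / sum(x_ij^2)
b = (F.T @ y_mean) / (F**2).sum(axis=0)

# Row variances for n = 2 replicates, and their mean (reproducibility variance).
s2_rows = (y_a - y_b) ** 2 / 2.0
s2 = s2_rows.mean()

print(np.round(b, 3))
print(round(s2, 3))
```

The linear and interaction coefficients reproduce the quoted values; the two quadratic coefficients computed this way come out near 3.0 and 1.0, slightly below the rounded figures 3.1 and 1.1.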


The adequacy variance is obtained from the formula:

\[ S_{ad}^2 = \frac{n\sum_{i=1}^{N}\left(\bar{y}_i - \hat{y}_i\right)^2}{N - \ell} = 2.46 \]

where:

ℓ is the number of significant coefficients.

The regression model is adequate.

3.9

Full  Factorial  Combined  with  Mixture  Design-Crossed  Design 

In 1981, Cornell [23] published the results of experimental studies in which the quality of a fish patty was defined by both its composition and its production process. The experiment included seven different mixtures, defined in accord with a simplex-centroid design (Fig. 3.32) and prepared by mixing different species of fish; the resulting patties were then subjected to various cooking conditions. The preparation procedure of the defined mixtures (formulations) included three controlled process factors with the associated variation levels: baking temperature from 190 °C to 218 °C; time in the oven from 25 min to 40 min; deep fat frying time from 25 s to 40 s. The process factor levels were varied in accordance with the 2^3 full factorial experiment. The 7×2^3 design of experiment with 56 trials (simplex-centroid design × 2^3 full factorial design) was sufficient for mathematical modelling of the observed phenomenon, Fig. 3.33. A regression model with 56 regression coefficients, or a reduced regression model with 18 coefficients [24], was sufficient for an adequate description of the problem.


Figure 3.32 Simplex-centroid design


This was the first example of application of a mixture design × process factor design in experimental studies. Since such designs contain a relatively large number of factors, it is of interest to replace full factorial designs of process factors with fractional factorial designs. Examples of such designs with applications on an industrial level were presented in the works of Wagner and Gorman [25]; John and Gorman [26]; Ziegel [27].

Figure 3.33a Simplex-centroid design in each point of a 2^3 full factorial experiment

Figure 3.33b 2^3 Factorial design in each point of a simplex-centroid design


The design of experiment with its outcomes according to Cornell's [23] research is given in Table 3.54. The patty composition was defined by the proportions of X1, X2, X3. Seven compositions are given in the table; the composition (1/2, 1/2, 0) means a mixture of 50% X1, 50% X2 and 0% X3. Eight combinations of the process factors Z1, Z2 and Z3 are defined in coded values (-1, +1). Coded values of process factors are determined in this way:

\[ Z_1 = \frac{z_1 - 204}{14};\quad Z_2 = \frac{z_2 - 32.5}{7.5};\quad Z_3 = \frac{z_3 - 32.5}{7.5} \tag{3.134} \]

Table 3.54 Simplex-centroid × full factorial design

Coded values of      Proportion of components (X1, X2, X3)
process factors
Z1   Z2   Z3    (1,0,0)   (0,1,0)   (0,0,1)   (1/2,1/2,0)   (1/2,0,1/2)   (0,1/2,1/2)   (1/3,1/3,1/3)
-    -    -     1.84      0.67      1.51      1.29          1.42          1.16          1.59
+    -    -     2.86      1.10      1.60      1.53          1.81          1.50          1.68
-    +    -     3.01      1.21      2.32      1.93          2.57          1.83          1.94
+    +    -     4.13      1.67      2.57      2.26          3.15          2.22          2.60
-    -    +     1.65      0.58      1.21      1.18          1.45          1.07          1.41
+    -    +     2.32      0.97      2.12      1.45          1.93          1.28          1.54
-    +    +     3.04      1.16      2.00      1.85          2.39          1.60          2.05
+    +    +     4.13      1.30      2.75      2.06          2.82          2.10          2.32

The system response is texture, measured as the pressure in milligrams required to puncture the patty in a standardized setup. The response was obtained by replicating trials. The complete 56-coefficient regression model is obtained by multiplying each term of the seven-coefficient model for the mixture composition

\[ y = \beta_1X_1 + \beta_2X_2 + \beta_3X_3 + \beta_{12}X_1X_2 + \beta_{13}X_1X_3 + \beta_{23}X_2X_3 + \beta_{123}X_1X_2X_3 \tag{3.135} \]

with each of the eight terms of the regression model of process factors

\[ y = \alpha_0 + \alpha_1Z_1 + \alpha_2Z_2 + \alpha_3Z_3 + \alpha_{12}Z_1Z_2 + \alpha_{13}Z_1Z_3 + \alpha_{23}Z_2Z_3 + \alpha_{123}Z_1Z_2Z_3 + \varepsilon \tag{3.136} \]

The result of multiplication is the regression model:

\[
\begin{aligned}
\hat{y}(X, Z) ={}& \sum_{i=1}^{3}\Bigl[b_i^{0} + \sum_{l=1}^{3} b_i^{l}Z_l + \sum_{l<m}^{3} b_i^{lm}Z_lZ_m + b_i^{123}Z_1Z_2Z_3\Bigr]X_i \\
&+ \sum_{i<j}^{3}\Bigl[b_{ij}^{0} + \sum_{l=1}^{3} b_{ij}^{l}Z_l + \sum_{l<m}^{3} b_{ij}^{lm}Z_lZ_m + b_{ij}^{123}Z_1Z_2Z_3\Bigr]X_iX_j \\
&+ \Bigl[b_{123}^{0} + \sum_{l=1}^{3} b_{123}^{l}Z_l + \sum_{l<m}^{3} b_{123}^{lm}Z_lZ_m + b_{123}^{123}Z_1Z_2Z_3\Bigr]X_1X_2X_3
\end{aligned}
\tag{3.137}
\]

Estimates  of  regression  coefficients  based  on  results  from  Table  3.54  are  given  in 
Table  3.55.  It  should  be  mentioned  that  coefficient  notation  b y corresponds  to  Schejfe 
notation,  and  index  (.  is  conected  to  process  factors,  while  index  ij  has  to  do  with 
mixture  of  components. 
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n 

Table 3.55  Estimates $b_{ij}^{l}$ and associated standard errors

Marks     Averages   Z1       Z2       Z3       Z1Z2     Z1Z3     Z2Z3     Z1Z2Z3   SE
X1          2.87*     0.49*    0.71*   -0.09     0.07    -0.05     0.10     0.04    0.05
X2          1.08*     0.18*    0.25*   -0.08    -0.03    -0.05    -0.03    -0.04    0.05
X3          2.01*     0.25*    0.40*    0.01     0.00     0.17*   -0.05    -0.04    0.05
X1X2       -1.14*    -0.81*   -0.59     0.10    -0.06     0.14    -0.19    -0.09    0.23
X1X3       -1.00*    -0.54    -0.01    -0.03    -0.06    -0.27    -0.43    -0.12    0.23
X2X3        0.20     -0.14     0.07    -0.19     0.23    -0.25     0.12     0.27    0.23
X1X2X3      3.18      0.07    -1.41     0.11     1.74    -0.71     1.77    -1.33    1.65

* Statistically significant regression coefficients with 99% confidence
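The estimates in Table 3.55 can be reproduced from the responses in Table 3.54. The sketch below is one way to do it, not necessarily the calculation used in the cited work: it first computes the factorial effects (mean, main effects and interactions) within each blend column, and then converts these blend-wise values into Scheffé-type coefficients with the usual simplex-centroid formulas. The names `DATA`, `factorial_effects` and `scheffe` are illustrative.

```python
from itertools import combinations

# Responses from Table 3.54. Keys are coded (Z1, Z2, Z3) settings; values follow
# the blend order in BLENDS: pure components, binary blends, centroid.
BLENDS = ["1", "2", "3", "12", "13", "23", "123"]
DATA = {
    (-1, -1, -1): [1.84, 0.67, 1.51, 1.29, 1.42, 1.16, 1.59],
    ( 1, -1, -1): [2.86, 1.10, 1.60, 1.53, 1.81, 1.50, 1.68],
    (-1,  1, -1): [3.01, 1.21, 2.32, 1.93, 2.57, 1.83, 1.94],
    ( 1,  1, -1): [4.13, 1.67, 2.57, 2.26, 3.15, 2.22, 2.60],
    (-1, -1,  1): [1.65, 0.58, 1.21, 1.18, 1.45, 1.07, 1.41],
    ( 1, -1,  1): [2.32, 0.97, 2.12, 1.45, 1.93, 1.28, 1.54],
    (-1,  1,  1): [3.04, 1.16, 2.00, 1.85, 2.39, 1.60, 2.05],
    ( 1,  1,  1): [4.13, 1.30, 2.75, 2.06, 2.82, 2.10, 2.32],
}


def factorial_effects(column):
    """Return {subset of process factors: estimate} for one blend column.

    The empty subset gives the column average; (0,) is Z1, (0, 1) is Z1Z2, etc.
    """
    effects = {}
    for size in range(4):
        for subset in combinations((0, 1, 2), size):
            total = 0.0
            for z, ys in DATA.items():
                sign = 1
                for l in subset:
                    sign *= z[l]
                total += sign * ys[column]
            effects[subset] = total / len(DATA)
    return effects


e = {name: factorial_effects(k) for k, name in enumerate(BLENDS)}


def scheffe(term, subset):
    """Convert blend-wise effects into the coefficient of the given mixture term."""
    if len(term) == 1:
        return e[term][subset]
    if len(term) == 2:
        i, j = term
        return 4 * e[term][subset] - 2 * (e[i][subset] + e[j][subset])
    pairs = e["12"][subset] + e["13"][subset] + e["23"][subset]
    singles = e["1"][subset] + e["2"][subset] + e["3"][subset]
    return 27 * e["123"][subset] - 12 * pairs + 3 * singles


# Columns: Averages, Z1, Z2, Z3, Z1Z2, Z1Z3, Z2Z3, Z1Z2Z3
for term in BLENDS:
    row = [scheffe(term, s) for s in e[term]]
    print(("X" + "X".join(term)).ljust(8), " ".join(f"{v:6.2f}" for v in row))
```

Because the crossed design has 56 points and the model has 56 coefficients, the fit is exact, so these values are simple contrasts of the observations rather than least-squares compromises.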

Table 3.56  Estimates $b_{ij}^{l}$ for a reduced regression model

Marks     Averages   Z1       Z2       Z3       Z1Z2     Z1Z3     Z2Z3     Z1Z2Z3   SE
X1          2.87      0.49     0.70    -0.06*    -       -0.06*    -        -       0.04
X2          1.10      0.17     0.26    -0.06*    -       -0.06*    -        -       0.04
X3          2.03      0.24     0.39    -0.06*    -        0.12*    -        -       0.04
X1X2       -1.17     -0.80    -0.66     -        -        -        -        -       0.20
X1X3       -1.03     -0.52     -        -        -        -        -        -       0.20
X2X3        -         -        -        -        -        -        -        -       -
X1X2X3      3.67      -        -        -        -        -        -        -       1.32

* $b_1^{3} = b_2^{3} = b_3^{3} = -0.06$, SE = 0.02; $b_1^{13} = b_2^{13} = -0.06$, SE = 0.02; $b_3^{13} = 0.12$, SE = 0.04

Most of the regression coefficients in Table 3.55 are not statistically significant, so
the t-test suggests that a reduced regression model may adequately describe the
experimental outcomes. Regression coefficients and the regression analysis for a
reduced regression model are given in the work of Gorman and Cornell [24]; the
results are summarized in Table 3.56.

All regression coefficients in Table 3.56 are statistically significant with 99% confidence.

When constraints are placed on the proportions of components, the approach is well known:
simplex-centroid designs are constructed in coded components, or pseudocomponents [23].
The coded factors in this case are linear functions of the real component proportions, and
the data analysis is not much more complicated. If upper and lower constraints (bounds)
are placed on some of the $X_i$, resulting in a factor space whose shape differs from the
simplex, then the formulas for estimating the model coefficients are not easily
expressible. In the simplex-centroid × 2^3 full factorial design, or the simplex-lattice × 2^n
design [5], the number of points increases rapidly with an increasing number of mixture
components and/or process factors. In such situations, fractional factorial experiments
are used instead of the full factorial. The number of experimental trials required for
studying the combined effects of the mixture components and process factors depends on
the form of the combined model to be fitted in the $X_i$'s and $Z_l$'s. As an alternative
to fitting the complete model containing $2^{q+n} - 2^{n}$ terms (for q = 3 and n = 3 this
is 64 - 8 = 56 terms), let us define the reduced combined model with fewer than
$2^{q+n} - 2^{n}$ terms. The reduced model can be written using the following abbreviated
models:

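$$
\begin{aligned}
M_{1+1} &: \sum_{i=1}^{q} \beta_i^{0} X_i + \sum_{l=1}^{n} \sum_{i=1}^{q} \beta_i^{l} X_i Z_l \\
M_{1+2} &: \sum_{l<m}^{n} \sum_{i=1}^{q} \beta_i^{lm} X_i Z_l Z_m \\
M_{1+3} &: \sum_{l<m<p}^{n} \sum_{i=1}^{q} \beta_i^{lmp} X_i Z_l Z_m Z_p \\
&\;\;\vdots \\
M_{1+n} &: \sum_{i=1}^{q} \beta_i^{12 \ldots n} X_i Z_1 Z_2 \cdots Z_n
\end{aligned}
\tag{3.138}
$$

$$
\begin{aligned}
M_{2+1} &: \sum_{i<j}^{q} \beta_{ij}^{0} X_i X_j + \sum_{l=1}^{n} \sum_{i<j}^{q} \beta_{ij}^{l} X_i X_j Z_l \\
M_{2+2} &: \sum_{l<m}^{n} \sum_{i<j}^{q} \beta_{ij}^{lm} X_i X_j Z_l Z_m \\
&\;\;\vdots \\
M_{3+1} &: \sum_{i<j<k}^{q} \beta_{ijk}^{0} X_i X_j X_k + \sum_{l=1}^{n} \sum_{i<j<k}^{q} \beta_{ijk}^{l} X_i X_j X_k Z_l \\
&\;\;\vdots \\
M_{3+n} &: \sum_{i<j<k}^{q} \beta_{ijk}^{12 \ldots n} X_i X_j X_k Z_1 Z_2 \cdots Z_n
\end{aligned}
\tag{3.139}
$$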


where the subscript r+s in $M_{r+s}$ refers to the inclusion of terms of degrees r and s
in the $X_i$'s and $Z_l$'s, respectively.

The model $M_{r+1}$ contains the r-th degree terms in the mixture components alone,
along with the products of these terms with the first-degree terms in the $Z_l$'s. For
example, a planar or first-degree model in the mixture components, combined with a
main-effects-only model in the process variables, is $y = M_{1+1} + \varepsilon$. A planar
model in the $X_i$'s, combined with a main effects plus first-order interaction effects
model in the $Z_l$'s, would be $y = M_{1+1} + M_{1+2} + \varepsilon$. The model containing up
to quadratic blending terms by main effects in the $Z_l$'s is defined as
$y = M_{1+1} + M_{2+1} + \varepsilon$. This continues up to the complete model with
$2^{q+n} - 2^{n}$ terms, which is defined as:

$$ y = M_{1+1} + M_{2+1} + \dots + M_{q+n} + \varepsilon $$

The choice of design arrangement, in terms of the number of points as well as
the coordinate settings of the points, depends on the specific form of the model to
be fitted. For the mixture components, if only the first-degree terms in the $X_i$ are to
be fitted, then just two values of $X_i$ are required (say, $X_i = 0$ and $X_i = 1$). If the
quadratic blending terms are to be fitted as well, then the binary blends
$X_i = X_j = 1/2$, $X_k = 0$ are required in addition to the settings $X_i = 0; 1$. Similar
arguments hold for the inclusion of the terms representing the different combinations of
levels of the process variables. The number of combinations depends on the types of
effects (main effects, two-factor interactions, and so on) that are to be estimated. In this
section we shall be interested primarily in measuring the main effects of the process
variables, particularly as they affect the blending properties of the mixture components.
To begin the investigation of the types of designs that are possible, we shall limit the
discussion for the moment to the case of two mixture components (q = 2), whose
proportions are denoted by X1 and X2, combined with three process factors (n = 3), which
are represented by Z1, Z2 and Z3. Also, for simplicity of illustration, let us write the
model to be fitted as $y = M_{1+1} + M_{2+1} + \varepsilon$. Although simple in form, this
model provides measures of the linear effects of the individual components, the quadratic
(nonlinear) effect of blending the two components, and the main effects of the three
process factors on the linear and nonlinear effects of the two components. To fit the
model $y = M_{1+1} + M_{2+1} + \varepsilon$, data are collected for each of the three blends
(X1, X2) = (1, 0); (0, 1); (1/2, 1/2) at each of the selected combinations of the levels of
Z1, Z2 and Z3. Let us denote the two mixture components, whose proportions were
represented previously by X1 and X2, by A and B. The presence (absence) of the lower-case
letter a or b in the trial symbol represents the presence (absence) of the component. The
three blends defined previously as (1, 0); (0, 1) and (1/2, 1/2) are now a, b and ab,
respectively.

Let us denote the three process factors by D, E and F, where the presence (absence)
of the lower-case letter d, e or f in the trial designation indicates that the high (low)
level of the factor is used. The eight combinations of factor levels are (1), d, e, de, f,
df, ef and def. The trial designation ad means that component A is present (but
component B is not) at the high level of factor D and the low levels of factors E and F.
The trial designation abd means that components A and B are present in a 1:1 blend at the
high level of factor D and the low levels of factors E and F.

Two possible designs that can be used for collecting data from which to estimate
the coefficients of the model $y = M_{1+1} + M_{2+1} + \varepsilon$ are presented in
Figs. 3.34 and 3.35.

In each design the number of combinations of the levels of the factors D, E and F
is reduced from 2^3 = 8 to 4 by using a 2^(3-1) half-fraction replicate. In the design in
Fig. 3.34, the fraction consisting of the process-factor level combinations d, e, f and
def is set up at each point of composition. We denote this fraction of the full factorial
design by I+DEF, because the overall mean and the three-factor interaction effect
DEF are measured jointly using the same linear combination of the observations.
The main effect of factor D and the EF interaction effect are estimated by the same
contrast among the observations; the same can be said for the effects E and DF, and F and
DE. In the design in Fig. 3.35, the fraction I+DEF is set up at the pure blends
[(X1, X2) = (1, 0); (0, 1)], but the opposite fraction, denoted by I-DEF and consisting of
the level combinations (1), de, df and ef, is set up at the (1/2, 1/2) blend of A and B.

Blend (A, B):   (1, 0)     (1/2, 1/2)   (0, 1)
Trials:         ad         abd          bd
                ae         abe          be
                af         abf          bf
                adef       abdef        bdef
Fraction:       I + DEF    I + DEF      I + DEF

Figure 3.34  The same 2^(3-1) fractional replicate design (I+DEF) in the process factors D, E and F set up at the three points of composition (a, ab, b)

Blend (A, B):   (1, 0)     (1/2, 1/2)   (0, 1)
Trials:         ad         ab           bd
                ae         abde         be
                af         abdf         bf
                adef       abef         bdef
Fraction:       I + DEF    I - DEF      I + DEF

Figure 3.35  Different 2^(3-1) fractional replicate designs (I±DEF) in the process factors D, E and F set up at the three points of composition (a, ab, b)
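The two half fractions used in Figs. 3.34 and 3.35 can be generated by splitting the eight trials of the 2^3 factorial according to the sign of the product of the coded levels of D, E and F. The following is a minimal sketch of that split; the function name `label` and the list names are illustrative only.

```python
from itertools import product

FACTORS = "def"  # a lower-case letter in the label means that factor is at its high level


def label(levels):
    """Build the trial label, e.g. (1, -1, 1) -> 'df' and (-1, -1, -1) -> '(1)'."""
    name = "".join(f for f, z in zip(FACTORS, levels) if z == 1)
    return name or "(1)"


plus, minus = [], []
for levels in product((-1, 1), repeat=3):
    d, e, f = levels
    # Trials with D*E*F = +1 form the fraction I + DEF, the others I - DEF.
    (plus if d * e * f == 1 else minus).append(label(levels))

print("I + DEF:", sorted(plus))
print("I - DEF:", sorted(minus))
```

Within either fraction the column for D is identical, up to sign, to the product column EF, which is why the main effect D and the EF interaction share one contrast.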

In estimating the coefficients $\beta_1^{l}$, $\beta_2^{l}$ and $\beta_{12}^{l}$ in the model
$y = M_{1+1} + M_{2+1} + \varepsilon$, two questions of interest arise:

1)  Are  the  calculating  formulas  for  the  estimates  of  the  model  coefficients  the 
same  with  both  designs? 

2)  How,  if  at  all,  do  the  variance  and  bias  of  the  coefficient  estimates  differ  for 
the  two  designs? 

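With the designs in Figs. 3.34 and 3.35, the calculating formulas for the coefficient
estimates $b_1^{l}$ and $b_2^{l}$ (l = 0, 1, 2 and 3) involve simple linear combinations of
the average responses at the design points. Denoting the observed response value of the
trial ad by ad, for example, the estimates of the linear and main-effect coefficients are
calculated as:

$$ b_1^{0} = \tfrac{1}{4}(ad + ae + af + adef) = \bar{y} \ \text{for } X_1 = 1 \tag{3.140} $$

$$ b_1^{1} = \tfrac{1}{2}\left[\frac{ad + adef}{2} - \frac{ae + af}{2}\right] = D \ \text{for } X_1 = 1 \tag{3.141} $$

$$ b_1^{2} = \tfrac{1}{2}\left[\frac{ae + adef}{2} - \frac{ad + af}{2}\right] = E \ \text{for } X_1 = 1 \tag{3.142} $$

$$ b_1^{3} = \tfrac{1}{2}\left[\frac{af + adef}{2} - \frac{ad + ae}{2}\right] = F \ \text{for } X_1 = 1 \tag{3.143} $$

$$ b_2^{0} = \tfrac{1}{4}(bd + be + bf + bdef) = \bar{y} \ \text{for } X_2 = 1 \tag{3.144} $$

$$ b_2^{1} = D \ \text{for } X_2 = 1; \qquad b_2^{2} = E \ \text{for } X_2 = 1; \qquad b_2^{3} = F \ \text{for } X_2 = 1 \tag{3.145} $$

$$ b_{12}^{0} = 4\,[\bar{y} \ \text{for } X_1 = X_2 = 1/2] - 2\,\{[\bar{y} \ \text{for } X_1 = 1] + [\bar{y} \ \text{for } X_2 = 1]\} \tag{3.146} $$

$$ b_{12}^{1} = 4\,[\text{main effect } D \ \text{for } X_1 = X_2 = 1/2] - 2\,\{[\text{main effect } D \ \text{for } X_1 = 1] + [\text{main effect } D \ \text{for } X_2 = 1]\} \tag{3.147} $$

For the design in Fig. 3.34:

$$ b_{12}^{0} = 4\left[\tfrac{1}{4}(abd + abe + abf + abdef)\right] - 2\left[\tfrac{1}{4}(ad + ae + af + adef) + \tfrac{1}{4}(bd + be + bf + bdef)\right] \tag{3.148} $$

$$
\begin{aligned}
b_{12}^{1} &= 4\left[\tfrac{1}{2}\left(\frac{abd + abdef}{2} - \frac{abe + abf}{2}\right)\right]
- 2\left[\tfrac{1}{2}\left(\frac{ad + adef}{2} - \frac{ae + af}{2}\right) + \tfrac{1}{2}\left(\frac{bd + bdef}{2} - \frac{be + bf}{2}\right)\right] \\
&= 4\,[D \ \text{for } X_1 = X_2 = 1/2] - 2\,\{[D \ \text{for } X_1 = 1] + [D \ \text{for } X_2 = 1]\}
\end{aligned}
\tag{3.149}
$$

For the design in Fig. 3.35:

$$ b_{12}^{0} = 4\left[\tfrac{1}{4}(ab + abde + abdf + abef)\right] - 2\left[\tfrac{1}{4}(ad + ae + af + adef) + \tfrac{1}{4}(bd + be + bf + bdef)\right] \tag{3.150} $$

$$
\begin{aligned}
b_{12}^{1} &= 4\left[\tfrac{1}{2}\left(\frac{abde + abdf}{2} - \frac{ab + abef}{2}\right)\right]
- 2\left[\tfrac{1}{2}\left(\frac{ad + adef}{2} - \frac{ae + af}{2}\right) + \tfrac{1}{2}\left(\frac{bd + bdef}{2} - \frac{be + bf}{2}\right)\right] \\
&= 4\,[D \ \text{for } X_1 = X_2 = 1/2] - 2\,\{[D \ \text{for } X_1 = 1] + [D \ \text{for } X_2 = 1]\}
\end{aligned}
\tag{3.151}
$$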


Let us illustrate numerically the calculating formulas for the estimates
$b_1^{l}$, $b_2^{l}$ and $b_{12}^{l}$ with the designs in Figs. 3.34 and 3.35, using the data
in Table 3.54. The first term in $b_1^{0}$ is ad, which corresponds to the value 2.86 found
in the (1, -1, -1) row of column (1, 0, 0). This is the observed response when D is at the
high level and E and F are at their low levels, which corresponds to
(Z1, Z2, Z3) = (1, -1, -1). The a indicates that only component A is present, so one uses
the (1, 0, 0) column. Similarly, one obtains for either design

$$ b_1^{0} = \tfrac{1}{4}(2.86 + 3.01 + 1.65 + 4.13) = 2.91; $$

$$ b_1^{1} = \tfrac{1}{2}\left(\frac{2.86 + 4.13}{2} - \frac{3.01 + 1.65}{2}\right) = 0.58; \qquad b_1^{2} = 0.66; \qquad b_1^{3} = -0.02; $$

$$ b_2^{0} = \tfrac{1}{4}(1.10 + 1.21 + 0.58 + 1.30) = 1.05; $$

$$ b_2^{1} = \tfrac{1}{2}\left(\frac{1.10 + 1.30}{2} - \frac{1.21 + 0.58}{2}\right) = 0.15; \qquad b_2^{2} = 0.21; \qquad b_2^{3} = -0.11. $$

For the design in Fig. 3.34:

$$
\begin{aligned}
b_{12}^{0} &= 4\left[\tfrac{1}{4}(1.53 + 1.93 + 1.18 + 2.06)\right] - 2\left[\tfrac{1}{4}(2.86 + 3.01 + 1.65 + 4.13) + \tfrac{1}{4}(1.10 + 1.21 + 0.58 + 1.30)\right] \\
&= 6.70 - 2(2.91 + 1.05) = -1.22
\end{aligned}
$$

$$
\begin{aligned}
b_{12}^{1} &= 4\left[\tfrac{1}{2}\left(\frac{1.53 + 2.06}{2} - \frac{1.93 + 1.18}{2}\right)\right]
- 2\left[\tfrac{1}{2}\left(\frac{2.86 + 4.13}{2} - \frac{3.01 + 1.65}{2}\right) + \tfrac{1}{2}\left(\frac{1.10 + 1.30}{2} - \frac{1.21 + 0.58}{2}\right)\right] \\
&= 0.48 - 2(0.58 + 0.15) = -0.98
\end{aligned}
$$

$$ b_{12}^{2} = -0.46; \qquad b_{12}^{3} = 0.04. $$

For the design in Fig. 3.35:

$$
b_{12}^{0} = 4\left[\tfrac{1}{4}(1.29 + 2.26 + 1.45 + 1.85)\right] - 2\left[\tfrac{1}{4}(2.86 + 3.01 + 1.65 + 4.13) + \tfrac{1}{4}(1.10 + 1.21 + 0.58 + 1.30)\right] = -1.07
$$

$$
b_{12}^{1} = 4\left[\tfrac{1}{2}\left(\frac{2.26 + 1.45}{2} - \frac{1.29 + 1.85}{2}\right)\right]
- 2\left[\tfrac{1}{2}\left(\frac{2.86 + 4.13}{2} - \frac{3.01 + 1.65}{2}\right) + \tfrac{1}{2}\left(\frac{1.10 + 1.30}{2} - \frac{1.21 + 0.58}{2}\right)\right] = -0.89
$$

$$ b_{12}^{2} = -0.37; \qquad b_{12}^{3} = 0.01. $$

The regression model for the design in Fig. 3.34 is:

$$
\begin{aligned}
y(X, Z) ={}& (2.91 + 0.58 Z_1 + 0.66 Z_2 - 0.02 Z_3) X_1 \\
&+ (1.05 + 0.15 Z_1 + 0.21 Z_2 - 0.11 Z_3) X_2 \\
&+ (-1.22 - 0.98 Z_1 - 0.46 Z_2 + 0.04 Z_3) X_1 X_2
\end{aligned}
\tag{3.152}
$$

The regression model for the design in Fig. 3.35 is:

$$
\begin{aligned}
y(X, Z) ={}& (2.91 + 0.58 Z_1 + 0.66 Z_2 - 0.02 Z_3) X_1 \\
&+ (1.05 + 0.15 Z_1 + 0.21 Z_2 - 0.11 Z_3) X_2 \\
&+ (-1.07 - 0.89 Z_1 - 0.37 Z_2 + 0.01 Z_3) X_1 X_2
\end{aligned}
\tag{3.153}
$$
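The coefficient estimates for the two fractional designs can be checked with a few lines of code. The sketch below takes the responses for the blends a, b and ab from Table 3.54 and applies the mean, main-effect and blending formulas to the fraction I+DEF (Fig. 3.34) or to the mixed fractions (Fig. 3.35). The names `Y`, `effects` and `fit` are illustrative. Because the code works with unrounded intermediate values, a few results differ by 0.01 from those in the text, which are built from rounded intermediates (for example -0.99 against -0.98 for the Z1 blending coefficient of Fig. 3.34).

```python
# Responses from Table 3.54 for the blends a = (1,0,0), b = (0,1,0) and
# ab = (1/2,1/2,0); keys are the process-factor trial labels.
Y = {
    "a":  {"(1)": 1.84, "d": 2.86, "e": 3.01, "de": 4.13,
           "f": 1.65, "df": 2.32, "ef": 3.04, "def": 4.13},
    "b":  {"(1)": 0.67, "d": 1.10, "e": 1.21, "de": 1.67,
           "f": 0.58, "df": 0.97, "ef": 1.16, "def": 1.30},
    "ab": {"(1)": 1.29, "d": 1.53, "e": 1.93, "de": 2.26,
           "f": 1.18, "df": 1.45, "ef": 1.85, "def": 2.06},
}
PLUS = ["d", "e", "f", "def"]       # fraction I + DEF
MINUS = ["(1)", "de", "df", "ef"]   # fraction I - DEF


def effects(blend, fraction):
    """Return [mean, D, E, F] for one blend over a four-trial half fraction."""
    ys = Y[blend]
    out = [sum(ys[t] for t in fraction) / 4]
    for letter in "def":
        high = [ys[t] for t in fraction if letter in t]
        low = [ys[t] for t in fraction if letter not in t]
        out.append((sum(high) / 2 - sum(low) / 2) / 2)
    return out


def fit(ab_fraction):
    """Estimate b1, b2 (pure blends on I + DEF) and b12 for the chosen ab fraction."""
    b1 = effects("a", PLUS)
    b2 = effects("b", PLUS)
    ab = effects("ab", ab_fraction)
    b12 = [4 * m - 2 * (p + q) for m, p, q in zip(ab, b1, b2)]
    return b1, b2, b12


fig_334 = fit(PLUS)    # same fraction at all three blends
fig_335 = fit(MINUS)   # opposite fraction at the (1/2, 1/2) blend

for name, (b1, b2, b12) in (("Fig. 3.34", fig_334), ("Fig. 3.35", fig_335)):
    print(name)
    for term, coefs in (("X1", b1), ("X2", b2), ("X1X2", b12)):
        print("  ", term.ljust(5), " ".join(f"{c:6.2f}" for c in coefs))
```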

The closeness of the values of the estimates $b_1^{l}$, $b_2^{l}$ and $b_{12}^{l}$ obtained
with the designs in Figs. 3.34 and 3.35 to the values of the corresponding estimates in
Table 3.55, obtained from the full factorial design, is evidence of the absence of
interaction effects among the process factors (particularly with respect to the effect of
the two components whose proportions are denoted by X1 and X2). The approximate equality
of the values of the estimates obtained with the fractional and full factorial designs
lends support to our decision to consider the use of fractional designs in the process
factors when interactions among these factors are negligible.

Thus far we have discussed only the two-component simplex-centroid × three-process-factor
full factorial design. By keeping the design and model simple, the calculating
formulas for the coefficient estimates were easily written for the two fractional
replicate designs presented. When more than two mixture components are combined with
more than three process factors, and the simplex-centroid design is joined with a full
factorial design or some fraction of the full factorial, similar formulas involving the
trial combinations can be set up for estimating the model coefficients. For example,
Table 3.57 lists several possible fractional designs (denoted by the roman numerals I, II,
III, IV and V) along with the corresponding estimable coefficients in the fitted models
for experiments where three components are combined with three process factors. A
strategy that can be used to generate fractions of the simplex-centroid × 2^n
designs, and that was in fact used to generate the fractional designs in Table 3.57, is
given in Scheffé's work [6].



When three components X1, X2 and X3 are combined with three process factors
Z1, Z2 and Z3, two possible designs, similar to the designs in Figs. 3.34 and 3.35, are
presented in Figs. 3.36a and 3.36b.


Figure 3.36a  Fractional design in each point of a simplex-centroid design

Figure 3.36b  Fractional design in each point of a simplex-centroid design

Table 3.57  Simplex-centroid × full factorial design (q=3, n=3)

Design No.   Design a    Design b    Regression coefficients
IV           (diagram)   (diagram)   [process factors main effects] + 6 degrees of freedom

Legend of point symbols: PFE 2^3; (1), def; (1), de, df, ef; e, df; d, ef; d, e, f, def

The given designs are used for fitting a three-component simplex-centroid (or
incomplete cubic) model with main effects of the process factors:

$$
y = \sum_{i=1}^{3} \beta_i^{0} X_i + \sum_{i<j}^{3} \beta_{ij}^{0} X_i X_j + \beta_{123}^{0} X_1 X_2 X_3
+ \sum_{l=1}^{3} \left[ \sum_{i=1}^{3} \beta_i^{l} X_i + \sum_{i<j}^{3} \beta_{ij}^{l} X_i X_j + \beta_{123}^{l} X_1 X_2 X_3 \right] Z_l + \varepsilon
\tag{3.154}
$$

Figure 3.37  Fractional simplex designs by Figs. 3.36a and 3.36b


3.10  Examples  of  Complex  Optimizations  of  Mixture  Problems 
Example  I [28] 

A nine-component  composition  has  been  tested  in  developing  a new  composite 
material.  Three  components  were  the  binder  of  the  composite  material:  polyester 
EPX-279-1,  polyester  EPX-187-3  and  styrene.  These  materials  have  been  used  as  fill- 
ers: ash,  marble  powder,  glass  microspheres,  saran  microspheres,  wollastonite  and 
powder  made  by  grinding  shells.  Component  properties  with  variation  range  of  their 
proportions  are  given  in  Table  3.58. 


Table 3.58  Component properties

Component             Denotation   Density   Av. diameter   Price    Price    Variation
                                   g/cm3     µm             ¢/lb     ¢/l      range
Ash                   X1           2.580       4.5            1.97   11.20    1-5
Marble                X2           2.710      40.0            0.50    3.00    1-25
Glass                 X3           0.340      62.0           69.00   52.00    1-25
Saran                 X4           0.032      30.0          350.00   26.30    1-25
Wollastonite          X5           2.890      30.0            1.85   11.80    1-25
Shell powder          X6           1.300     125.0            2.50    7.16    5-29
Polyester EPX-279-1   X7           1.148       -             26.80   67.80    58-80
Polyester EPX-187-3   X8           1.335       -             18.00   56.50    5-29
Styrene               X9           0.900       -             11.50   22.81    5-29

To obtain a second-order regression model, an experiment by extreme vertices
design was carried out. The experiment included 45 trials and a replicated simplex-center
point, used to check the lack of fit of the regression model and to estimate the
experimental error, as shown in Table 3.59.

Each trial was carried out by mixing the components in the corresponding ratio in a
vertical mixer. The resulting mass was poured into molds, from which
it was removed after 800 s. The samples were then cured in a drying room for 20 h at 70 °C.

The samples were tested for density, tensile strength, bending strength, compressive
strength, modulus of elasticity, bending modulus and elongation at break. In addition,
the cost of each trial was calculated from its composition. The outcomes of the
experiment are given in Table 3.60. The experimental outcomes were fitted by a
second-order polynomial:

$$ y = B_0 + \sum_{i=1}^{8} B_i X_i + \sum_{i=1}^{8} B_{ii} X_i^{2} + \sum_{i=1}^{7} \sum_{j=i+1}^{8} B_{ij} X_i X_j \tag{3.155} $$

Regression coefficients of this form were calculated for all measured properties except
density. Owing to the large number of components and properties, the calculations were
done on a computer (Table 3.61). The density of the composite material was fitted with a
linear regression:

$$ \rho = B_0 + \sum_{i=1}^{8} B_i X_i \tag{3.156} $$

Owing to the specific application of the composite material, limits on the measured
properties were imposed in one case:

$$ \rho < 1.2\ \text{g/cm}^3; \quad \sigma_z > 3000\ \text{psi}; \quad \sigma_s > 2500\ \text{psi}; \quad \sigma_p > 5000\ \text{psi}; \quad E > 4 \times 10^{5}\ \text{psi}; \quad E_s > 4 \times 10^{5}\ \text{psi}; \quad \varepsilon > 15\% $$

Apart from these limits on properties, limits on the components were also imposed:

$$ \sum_{i=1}^{6} X_i \le 0.55; \qquad 1 - \sum_{i=1}^{8} X_i \le \tfrac{1}{3}(X_7 + X_8) $$

By applying a computer and associated programs for constrained (compromise)
optimization, the composition of the composite was determined:

$$ X_1 = 0.050; \ X_2 = 0.025; \ X_3 = 0.000; \ X_4 = 0.200; \ X_5 = 0.013; \ X_6 = 0.235; $$

$$ X_7 = 0.115; \ X_8 = 0.245; \ X_9 = 0.117; \quad \sum_{i=1}^{6} X_i = 0.523; \quad X_7 + X_8 = 0.360. $$
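The reported composition can be checked against the component limits and the density limit. The sketch below assumes that the component limits read "fillers X1 to X6 sum to at most 0.55" and "X9 is at most one third of X7 + X8", and it evaluates the linear density model (3.156) with the density coefficients B0 to B8 listed in Table 3.61. Variable names are illustrative.

```python
# Composition reported in the text (proportions X1..X9).
x = [0.050, 0.025, 0.000, 0.200, 0.013, 0.235, 0.115, 0.245, 0.117]

# Density column of Table 3.61: B0 and B1..B8 of the linear model (3.156).
b0 = 1.13249
b = [1.37649, 1.57042, -1.02582, -1.10084, 1.58401, 0.27504, 0.08755, 0.34422]

total = sum(x)              # mixture constraint: proportions sum to 1
fillers = sum(x[:6])        # X1..X6
resins = x[6] + x[7]        # X7 + X8
styrene = x[8]              # X9 = 1 - (X1 + ... + X8)

# zip stops at the shorter list, so X9 does not enter the density model.
density = b0 + sum(bi * xi for bi, xi in zip(b, x))

print(f"sum of proportions = {total:.3f}")
print(f"fillers            = {fillers:.3f} (limit 0.55)")
print(f"X9                 = {styrene:.3f} (limit {resins / 3:.3f})")
print(f"predicted density  = {density:.3f} g/cm3 (limit 1.2)")
```

Under these assumptions the predicted density comes out at about 1.200 g/cm3, so the density limit is the binding one at the reported optimum.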
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Table  3.59  Extreme  vertices  design 


N     X1       X2       X3       X4       X5       X6       X7       X8       X9
1     0.25     0.01     0.01     0.01     0.01     0.05     0.56     0.05     0.05
2     0.13     0.13     0.01     0.01     0.01     0.05     0.56     0.05     0.05
3     0.13     0.01     0.13     0.01     0.01     0.05     0.56     0.05     0.05
4     0.13     0.01     0.01     0.13     0.01     0.05     0.56     0.05     0.05
5     0.13     0.01     0.01     0.01     0.13     0.05     0.56     0.05     0.05
6     0.13     0.01     0.01     0.01     0.01     0.17     0.56     0.05     0.05
7     0.13     0.01     0.01     0.01     0.01     0.05     0.68     0.05     0.05
8     0.13     0.01     0.01     0.01     0.01     0.05     0.56     0.17     0.05
9     0.13     0.01     0.01     0.01     0.01     0.05     0.56     0.05     0.17
10    0.01     0.25     0.01     0.01     0.01     0.05     0.56     0.05     0.05
11    0.01     0.13     0.13     0.01     0.01     0.05     0.56     0.05     0.05
12    0.01     0.13     0.01     0.13     0.01     0.05     0.56     0.05     0.05
13    0.01     0.13     0.01     0.01     0.13     0.05     0.56     0.05     0.05
14    0.01     0.13     0.01     0.01     0.01     0.17     0.56     0.05     0.05
15    0.01     0.13     0.01     0.01     0.01     0.05     0.68     0.05     0.05
16    0.01     0.13     0.01     0.01     0.01     0.05     0.56     0.17     0.05
17    0.01     0.13     0.01     0.01     0.01     0.05     0.56     0.05     0.17
18    0.01     0.01     0.25     0.01     0.01     0.05     0.56     0.05     0.05
19    0.01     0.01     0.13     0.13     0.01     0.05     0.56     0.05     0.05
20    0.01     0.01     0.13     0.01     0.13     0.05     0.56     0.05     0.05
21    0.01     0.01     0.13     0.01     0.01     0.17     0.56     0.05     0.05
22    0.01     0.01     0.13     0.01     0.01     0.05     0.68     0.05     0.05
23    0.01     0.01     0.13     0.01     0.01     0.05     0.56     0.17     0.05
24    0.01     0.01     0.13     0.01     0.01     0.05     0.56     0.05     0.17
25    0.01     0.01     0.01     0.25     0.01     0.05     0.56     0.05     0.05
26    0.01     0.01     0.01     0.13     0.13     0.05     0.56     0.05     0.05
27    0.01     0.01     0.01     0.13     0.01     0.17     0.56     0.05     0.05
28    0.01     0.01     0.01     0.13     0.01     0.05     0.68     0.05     0.05
29    0.01     0.01     0.01     0.13     0.01     0.05     0.56     0.17     0.05
30    0.01     0.01     0.01     0.13     0.01     0.05     0.56     0.05     0.17
31    0.01     0.01     0.01     0.01     0.25     0.05     0.56     0.05     0.05
32    0.01     0.01     0.01     0.01     0.13     0.17     0.56     0.05     0.05
33    0.01     0.01     0.01     0.01     0.13     0.05     0.68     0.05     0.05
34    0.01     0.01     0.01     0.01     0.13     0.05     0.56     0.17     0.05
35    0.01     0.01     0.01     0.01     0.13     0.05     0.56     0.05     0.17
36    0.01     0.01     0.01     0.01     0.01     0.29     0.56     0.05     0.05
37    0.01     0.01     0.01     0.01     0.01     0.17     0.68     0.05     0.05
38    0.01     0.01     0.01     0.01     0.01     0.17     0.56     0.17     0.05
39    0.01     0.01     0.01     0.01     0.01     0.17     0.56     0.05     0.17
40    0.01     0.01     0.01     0.01     0.01     0.05     0.80     0.05     0.05
41    0.01     0.01     0.01     0.01     0.01     0.05     0.68     0.17     0.05
42    0.01     0.01     0.01     0.01     0.01     0.05     0.68     0.05     0.17
43    0.01     0.01     0.01     0.01     0.01     0.05     0.56     0.29     0.05
44    0.01     0.01     0.01     0.01     0.01     0.05     0.56     0.17     0.17
45    0.01     0.01     0.01     0.01     0.01     0.05     0.56     0.05     0.29
46    0.0367   0.0367   0.0367   0.0367   0.0367   0.0767   0.5867   0.0767   0.0767
47    0.0367   0.0367   0.0367   0.0367   0.0367   0.0767   0.5867   0.0767   0.0767
48    0.0367   0.0367   0.0367   0.0367   0.0367   0.0767   0.5867   0.0767   0.0767

Table 3.60  Measured properties

N     ρ        σ_z         σ_s         σ_p         E           E_s         ε       Price
      g/cm3    10^-3 psi   10^-3 psi   10^-3 psi   10^-3 psi   10^-3 psi   %       ¢/l
1     1.524    1.790       5.066        6.992       96.772      75.821      6.25   46.03
2     1.540    2.174       5.746        6.995      144.436      98.033      3.50   45.05
3     1.255    1.992       5.158        6.744      150.662     200.548      3.75   50.93
4     1.218    1.810       4.812        7.759      104.285      73.772      4.75   47.84
5     1.561    2.547       6.320        6.841      192.286     238.093      2.75   46.10
6     1.370    2.619       6.527        8.425      176.678     255.389      2.83   45.56
7     1.353    1.742       5.350        9.087       76.219     119.019     13.00   52.82
8     1.375    2.740       6.457       10.320      170.477     203.992      3.16   51.47
9     1.323    2.647       7.008       11.265      158.520     196.839      4.50   47.44
10    1.556    1.756       4.930        5.053      120.088     178.962      4.75   44.06
11    1.271    1.495       4.193        5.049      105.821     144.552      5.50   49.94
12    1.234    1.326       3.922        5.826       68.542      92.949      8.83   46.86
13    1.571    2.012       5.710        5.072      149.286     106.484      3.50   45.12
14    1.360    2.093       5.640        6.273      123.925      90.966      4.25   44.58
15    1.369    1.354       4.322        7.250       57.920      84.523     13.30   51.84
16    1.391    2.676       6.291        8.583      172.846     212.616      3.50   50.48
17    1.339    2.391       5.931        9.483      155.229      84.923      4.00   46.45
18    0.986    1.055       3.020        6.851       68.198      87.952     10.00   55.82
19    0.950    0.979       3.111        7.875       49.265      39.920     12.70   52.74
20    1.293    1.529       4.401        4.697      117.360      89.910      4.25   51.00
21    1.102    1.579       4.645        6.073      101.706      76.399      5.50   50.46
22    1.085    1.178       3.822        9.195       49.636      93.970     20.25   57.72
23    1.107    2.134       5.855        8.791      140.909     171.469      5.00   56.36
24    1.055    1.951       5.627       10.742      113.217     149.987      8.16   52.33
25    0.913    0.831       2.405        7.880       28.853      39.518     16.70   49.66
26    1.255    1.551       4.641        4.123       88.557     145.391      7.10   47.92
27    1.065    1.362       3.995        6.659       60.385      47.749      9.25   47.37
28    1.047    1.170       3.550        9.325       36.676      22.903     20.00   54.64
29    1.070    2.074       5.458        9.936       98.425     117.795      6.10   53.28
30    1.018    2.096       5.961       14.849      101.368     121.118     11.70   49.25
31    1.599    1.701       6.842        5.523      113.891      92.042      4.75   46.18
32    1.408    2.061       5.786        6.140      151.090     233.562      2.75   45.63
33    1.390    1.488       4.923        7.655       69.508     127.688     11.75   52.90
34    1.413    2.648       6.893        8.795      188.910     238.930      2.75   51.54
35    1.361    2.683       6.832        8.820      177.011     228.251      3.83   47.51
36    1.218    2.230       4.952        7.444      125.650     167.862      4.75   45.09
37    1.200    1.742       5.723        8.605       65.118     118.473     15.67   52.35
38    1.221    2.967       7.502       11.180      168.391     209.106      3.25   51.00
39    1.170    2.837       7.476       12.713      147.005     187.590      6.58   46.96
40    1.181    1.376       4.225       13.302       35.650      47.539     28.10   59.62
41    1.204    2.465       6.813       16.235      106.444     121.308      9.83   58.26
42    1.152    2.156       7.116       10.277       93.620     127.054     24.30   54.23
43    1.266    3.671       8.489       20.086      183.699     233.242      3.00   56.90
44    1.174    3.725       8.810       17.021      184.266     188.491      5.17   52.87
45    1.122    3.467       8.203       17.576      163.933     205.205      7.50   48.84
46    1.259    2.222       5.554        8.330      138.991      81.023      4.50   50.25
47    1.259    2.101       5.449        9.419      130.262      78.496      5.50   50.25
48    1.259    2.089       5.560        9.391      133.316      78.339      5.25   50.25

Table  3.61 

Regression 

coefficients 

Regression 

coefficients 

p 

£ 

°P 

oz- 

«s 

Es 

E 

B0 

1.13249 

-114.19733  0.07269  0.01064 

0.01384 

0.56932 

0.4745 

B, 

1.37649 

95.98950  -0.02732  0.00078  - 

0.00283 

0.14554 

1.79251 

b2 

1.57042 

167.01782  -0.05128  -0.00615 

0.01668 

-5.75500 

0.70308 

b3 

-1.02582 

-14.61224  -0.04799  -0.00967  - 

0.020115 

-1.22441 

-0.29696 

b4 

-1.10084 

109.44629  -0.03456  -0.00979  - 

0.02067 

0.22113 

-1.35077 

b5 

1.58401 

160.08569  -0.06056  -0.00244  - 

0.01579 

1.72925 

1.58983 

b6 

0.27504 

82.41577  -0.03130  -0.00383 

0.00103 

-0.63911 

0.50197 

b7 

0.08755 

290.07031  -0.12926  -0.01465 

0.00056 

-0.33234 

-0.64604 

b8 

0.34422 

256.43359  -0.22232  -0.00569  - 

0.02062 

-1.35132 

0.43842 

Bn 

0.0 

182.80933  -0.05150  -0.02704  - 

0.04074 

-2.97189 

-3.16558 

B22 

0.0 

39.53931  0.00961  -0.00091 

0.01035 

8.39853 

-0.30142 

B33 

0.0 

84.78394  0.02581  0.00139  - 

0.00693 

0.72176 

-0.41278 

B44 

0.0 

137.30420  -0.00937  -0.00172  - 

0.01476 

1.06233 

0.57664 

Table 3.61 (continued)

Regression coefficients    P    £    °P    Es    E

b55    0.0    119.84521    0.05115   -0.01412    0.03872   -4.55200   -2.13765
b66    0.0     25.01221   -0.02219   -0.00174   -0.04392    0.89334   -0.21522
b77    0.0   -113.98730    0.07174    0.00431   -0.01337   -0.23637    0.16125
b88    0.0     37.80859    0.21372    0.00680    0.01492    3.09570   -0.41095
b12    0.0      0.0        0.0        0.0        0.0        1.93091    0.0
b13    0.0      0.0        0.0        0.0        0.0        4.53545    0.0
b14    0.0      0.0        0.0        0.0        0.0       -2.24657    0.0
b15    0.0      0.0        0.0        0.0        0.0        1.72719    0.0
b16    0.0      0.0        0.0        0.0        0.0        5.74762    0.0
b17    0.0   -270.89063    0.0        0.0        0.0       -0.56160   -2.24062
b18    0.0      0.0        0.0        0.0        0.0        2.11084    0.0
b23    0.0      0.0        0.0        0.0        0.0        8.43297    0.0
b24    0.0      0.0        0.0        0.0        0.0        6.87164    0.0
b25    0.0      0.0        0.0        0.0        0.0        0.37309    0.0
b26    0.0      0.0        0.0        0.0        0.0        2.11751    0.0
b27    0.0   -342.40527    0.0        0.0        0.0        4.83197   -1.45135
b28    0.0      0.0        0.0        0.0        0.0       10.49878    0.0
b34    0.0      0.0        0.0        0.0        0.0       -1.32475    0.0
b35    0.0      0.0        0.0        0.0        0.0       -5.29202    0.0
b36    0.0      0.0        0.0        0.0        0.0       -3.41042    0.0
b37    0.0      0.0        0.0        0.0        0.0        0.98099    0.0
b38    0.0      0.0        0.0        0.0        0.0        3.12524    0.0
b45    0.0      0.0        0.0        0.0        0.0        0.58264    0.0
b46    0.0      0.0        0.0        0.0        0.0       -3.37820    0.0
b47    0.0   -196.06665    0.0        0.0        0.0       -1.92389    1.13045
b48    0.0      0.0        0.0        0.0        0.0        1.41870    0.0
b56    0.0      0.0        0.0        0.0        0.0        2.08801    0.0
b57    0.0   -372.80078    0.0        0.0        0.0        2.08035   -2.25305
b58    0.0     61.21167    0.00910   -0.01003   -0.00587    2.39307    0.276081
b67    0.0   -188.13574    0.0        0.0        0.0        0.07410   -1.06024
b68    0.0      0.0        0.0        0.0        0.0        3.12915    0.0
b78    0.0   -520.73828    0.29166    2.00905    0.02934    0.20239   -0.40131
Example II [14]

When the technological parameters of raw benzene refining are changed, the
composition of the waste liquid changes as well. The waste liquid is often used for
obtaining expensive materials characterized by their density and viscosity. The water
content of such a liquid varies from 0 to 15%, and the ash content does not exceed 10%.
The research objective is to obtain a regression model that adequately describes
density and viscosity as functions of the proportions of organic matter, water and
ash. Since the proportions of the components are limited, the experiment has been done
by an extreme vertices design. The local factor space is given in Table 3.62, and the
design matrix with outcomes of trials in Table 3.63.

Table 3.62 Local factor space

Components    Proportions of components, %
              x1      x2      x3
X1            95       0       5
X2            80      15       5
X3            90       0      10

Table 3.63 Extreme vertices design

Response   Design matrix           Density g/cm3               Viscosity p
mark       X1     X2     X3        ρ1      ρ2      ρ̄           ν1     ν2     ν̄
y1         1      0      0         1.128   1.126   1.127       17.1   16.6   16.85
y2         0      1      0         1.070   1.072   1.071        4.0    5.0    4.50
y3         0      0      1         1.139   1.136   1.1375      21.0   19.1   20.05
y12        0.5    0.5    0         1.112   1.114   1.113       12.6   12.8   12.70
y13        0.5    0      0.5       1.135   1.130   1.1325      18.6   17.9   18.25
y23        0      0.5    0.5       1.118   1.120   1.119       14.6   14.8   14.70
y123*      1/3    1/3    1/3       1.122   1.123   1.1225      15.0   15.6   15.30

The  geometrical  interpretation  of  local  factor  space  is  given  in  Fig.  3.38. 
Figure 3.38 Local factor space (axis label: organic matter)


Connections between real and coded factors are given by these relations:

x1 = 0.95X1 + 0.8X2 + 0.9X3

x2 = 0.15X2

x3 = 0.05X1 + 0.05X2 + 0.1X3    (3.157)

By calculating regression coefficients, these second-order regression models have
been obtained:

ρ = 1.127X1 + 1.071X2 + 1.138X3 + 0.046X1X2 - 0.002X1X3 + 0.058X2X3    (3.158)

ν = 16.85X1 + 4.5X2 + 20.05X3 + 8.1X1X2 + 9.7X1X3 - 0.8X2X3    (3.159)

To check lack of fit of the obtained regression models, the variances of experimental
error have been calculated:

Sρ² = 3.6 × 10^-6; Sν² = 0.422; f = 7.

Lack of fit has been checked in control point y123. The values of density and
viscosity calculated in the control point by Eqs. (3.158) and (3.159) are
ρ̂123 = 1.123 and ν̂123 = 15.69, respectively.

Δρ123 = |ρ̄123 - ρ̂123| = |1.1225 - 1.1230| = 0.0005;

Δν123 = |ν̄123 - ν̂123| = |15.3 - 15.69| = 0.39

tρ = 0.0005 × √2/(1.276 × √(3.6 × 10^-6)) = 0.292;

tν = 0.39 × √2/(1.276 × √0.422) = 0.655.

Since the tabular value is t(0.025;7) = 2.3646, the calculated values tR are smaller, so
that the regression models are adequate. The geometric interpretation of the regression
models as contour lines in coded factors is given in Figs. 3.39 and 3.40.
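The check at the control point can be retraced numerically. The sketch below is only an illustration: it assumes that the factor 1.276 in the t-statistics is sqrt(1 + ξ), where ξ is the usual variance multiplier of a second-order simplex-lattice model at the centroid, and that each point was replicated twice. The function names are hypothetical, and the coefficients are copied from Eqs. (3.158) and (3.159) as printed.

```python
import math

def xi_second_order(x):
    """Variance multiplier of a second-order simplex-lattice model at point x
    (assumed form: a_i = x_i(2x_i - 1), a_ij = 4 x_i x_j)."""
    q = len(x)
    vertex_terms = sum((c * (2 * c - 1)) ** 2 for c in x)
    edge_terms = sum((4 * x[i] * x[j]) ** 2
                     for i in range(q) for j in range(i + 1, q))
    return vertex_terms + edge_terms

def t_lack_of_fit(delta, variance, replicates, xi):
    """t-statistic for the difference between observed and predicted response."""
    return delta * math.sqrt(replicates) / math.sqrt(variance * (1 + xi))

def density(x1, x2, x3):      # Eq. (3.158) as printed
    return (1.127 * x1 + 1.071 * x2 + 1.138 * x3
            + 0.046 * x1 * x2 - 0.002 * x1 * x3 + 0.058 * x2 * x3)

def viscosity(x1, x2, x3):    # Eq. (3.159) as printed
    return (16.85 * x1 + 4.5 * x2 + 20.05 * x3
            + 8.1 * x1 * x2 + 9.7 * x1 * x3 - 0.8 * x2 * x3)

centroid = (1 / 3, 1 / 3, 1 / 3)
xi = xi_second_order(centroid)
rho_hat = density(*centroid)
nu_hat = viscosity(*centroid)
t_rho = t_lack_of_fit(0.0005, 3.6e-6, 2, xi)   # deltas as quoted in the text
t_nu = t_lack_of_fit(0.39, 0.422, 2, xi)
print(round(rho_hat, 3), round(nu_hat, 2), round(t_rho, 3), round(t_nu, 3))
```

Both statistics stay well below the tabular value 2.3646, which is the condition the example uses to call the models adequate.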
Figure 3.40 Density contour lines


Example  III  [29] 

The research objective has been to determine how the durability of a coating depends
on the composition of the Ni-Cr-B mixture, and to find the optimal composition of this
three-component mixture. Since there is a linear correlation between wear resistance
and hardness of the coating, Rockwell hardness (HRC) has been chosen as the system
response. Based on preliminary information, it is known that the response surface is
smooth and continuous. Hence, it may be approximated by a lower-order polynomial. The
design of experiment matrix with outcomes is given in Table 3.64 (trials 1 to 6).

The second-order regression model for Rockwell hardness has this form:

y1 = 22X1 + 35X2 + 51X3 + 38X1X2 + 34X1X3 + 52X2X3    (3.160)

Lack of fit of the obtained regression model has been checked in trials 7, 10, 11,
12 and 13. Control trials have been chosen under the assumption that they are in
the optimum zone and may simultaneously be used for augmenting the model to a
higher-order regression model. The analyzed regression model has not been adequate
in the control design points. To obtain an incomplete cubic model, trial No. 7 has
been used to calculate regression coefficients. The regression model of incomplete
third order has this form:

y2 = 22X1 + 35X2 + 51X3 + 38X1X2 + 34X1X3 + 52X2X3 + 222X1X2X3    (3.161)

Lack of fit of the obtained model has been checked by trials 8, 9, 10, 14 and 15.
Regression model (3.161) has not been adequate in the chosen control trials. To
calculate regression coefficients for a third-order model, it has been necessary to
include these trials: 1, 2, 3 and 7, 8, 9, 10, 11, 12 and 13. The third-order
regression model is:

y3 = 22X1 + 35X2 + 51X3 + 11.25X1X2 + 38.25X1X3 + 58.50X2X3
     + 15.75X1X2(X1 - X2) + 51.75X1X3(X1 - X3) + 22.50X2X3(X2 - X3)
     + 270X1X2X3    (3.162)

Lack of fit of the obtained regression model has been checked in control trials 4,
5, 6 and 14, 15, 16, 17 and 18. The three trials 16, 17 and 18 lie inside the
concentration triangle. The obtained third-order regression model has been adequate
in all control trials. For a three-component mixture it is easiest to determine the
optimum from the geometric interpretation of the regression model. The contour lines
of regression model (3.162) are shown in Fig. 3.41.
Table 3.64 Design matrix for a three-component mixture

                                       Augmenting of models
N     X1      X2      X3      HRC     II order model   Incomplete III   III order
1     1       0       0       22      +                +                +
2     0       1       0       35      +                +                +
3     0       0       1       51      +                +                +
4     0.5     0.5     0       33      +                +                check
5     0.5     0       0.5     45      +                +                check
6     0       0.5     0.5     56      +                +                check
7     0.333   0.333   0.333   58      check            +                +
8     0.666   0.333   0       30                       check            +
9     0.333   0.666   0       32                       check            +
10    0       0.666   0.333   55      check            check            +
11    0       0.333   0.666   57      check                             +
12    0.666   0       0.333   44      check                             +
13    0.333   0       0.666   46      check                             +
14    0.75    0.25    0       29                       check            check
15    0.25    0.75    0       32                       check            check
16    0.50    0.25    0.25    54                                        check
17    0.25    0.50    0.25    58                                        check
18    0.25    0.25    0.50    56                                        check

+ trial used to calculate regression coefficients; check = check of lack of fit

Figure  3.41  Contour  lines  of  regression  model  Eq.  (3.162) 
The figure clearly shows that the optimum is in this factor space: Ni (20-30 %);
Cr (30-50 %); B (30-40 %).

It should be noted that by including trials No. 16, 17 and 18 one may calculate
regression coefficients for a fourth-order regression model.
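The third-order coefficients of Eq. (3.162) can be retraced from Table 3.64. The sketch below is a hedged illustration: it assumes the standard simplex-lattice estimators for a full cubic model, and the dictionary layout and function name are hypothetical. The key (i, j) of `y_third` denotes the trial with two thirds of component i and one third of component j (trials 8 to 13).

```python
from itertools import combinations

y_pure = {1: 22, 2: 35, 3: 51}                      # trials 1-3
y_third = {(1, 2): 30, (2, 1): 32, (2, 3): 55,      # trials 8, 9, 10
           (3, 2): 57, (1, 3): 44, (3, 1): 46}      # trials 11, 12, 13
y_centroid = 58                                     # trial 7

beta2, gamma = {}, {}
for i, j in combinations(sorted(y_pure), 2):
    # Assumed estimators for the X_i X_j and X_i X_j (X_i - X_j) terms
    beta2[(i, j)] = 9 / 4 * (y_third[(i, j)] + y_third[(j, i)]
                             - y_pure[i] - y_pure[j])
    gamma[(i, j)] = 9 / 4 * (3 * y_third[(i, j)] - 3 * y_third[(j, i)]
                             - y_pure[i] + y_pure[j])
beta123 = (27 * y_centroid - 27 / 4 * sum(y_third.values())
           + 9 / 2 * sum(y_pure.values()))

def predict(x1, x2, x3):
    """Evaluate the third-order mixture model at (x1, x2, x3)."""
    x = {1: x1, 2: x2, 3: x3}
    y = sum(y_pure[i] * x[i] for i in x)
    for (i, j), b in beta2.items():
        y += b * x[i] * x[j] + gamma[(i, j)] * x[i] * x[j] * (x[i] - x[j])
    return y + beta123 * x1 * x2 * x3

print(beta2, gamma, beta123)
print(predict(0.50, 0.25, 0.25))   # model value at control trial 16
```

Under these assumptions the computed coefficients agree with those printed in Eq. (3.162), and the model passes exactly through the trials used to fit it.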
A.1

Answers  to  Selected  Problems 


Chapter  I 

1.1 


Sample    Values                           Averages    Variances
1         3.4; 3.9; 3.9; 9.9; 3.5          5.7          8.4
2         5.5; 8.5; 6.7; 0.7; 4.5          5.2          4.8
3         1.5; 0.6; 3.3; 4.2; 5.7          3.6          4.9
4         2.2; 8.2; 9.4; 0.4; 9.9          4.9         12.3
5         8.4; 5.0; 0.8; 6.6; 3.1          4.1          9.2
6         9.9; 9.3; 5.1; 4.8; 9.2          5.9         10.5
7         6.5; 4.0; 9.2; 4.6; 2.1          3.9          7.4
8         1.8; 3.6; 5.2; 9.4; 2.9          4.9          8.9
9         7.5; 1.7; 5.6; 3.3; 8.0          4.5          7.1
10        4.2; 5.1; 2.3; 5.8; 2.6          3.8          4.8
11        6.8; 8.2; 3.6; 2.8; 8.3          5.4          6.9
12        6.1; 6.0; 2.8; 6.9; 9.1          4.8         12.1
13        3.5; 8.2; 6.5; 3.2; 2.9          4.5          6.5
14        (illegible); 8.8; 2.3            6.0          6.2
15        5.2; 4.7; 8.8; 2.4; 7.4          5.1          5.2
16        0.9; 0.0; 4.7; 4.4; 5.9          4.2         11.9
17        7.5; 8.0; 6.8; 3.0; 0.0          3.7         12.2
18        6.8; 2.2; 0.2; 5.9; 7.0          4.1         10.9
19        9.4; 0.3; 4.0; 9.8; 0.9          4.6         15.1
20        8.6; 9.9; 1.4; 5.7; 6.4          5.9          6.3

Averages:                                  4.7          8.5
Population parameters:                     4.5          8.3

Design  of  Experiments  in  Chemical  Engineering.  Zivorad  R.  Lazic 
Copyright  © 2004  WILEY-VCH  Verlag  GmbH  & Co.  KGaA,  Weinheim 
ISBN:  3-527-31142-4 
1.2

a) P(Z<1.4) - P(Z<0) = 0.9192 - 0.5000 = 0.4192

b) P(Z<0) - P(Z<-0.78) = P(Z<0.78) - P(Z<0) = 0.7823 - 0.5000 = 0.2823

c) P(Z<1.9) - P(Z<-0.24) = 0.9713 - [1 - P(Z<0.24)] = 0.9713 - 0.4052 = 0.5661

d) P(Z<1.96) - P(Z<0.75) = 0.9750 - 0.7734 = 0.2016

e) P(-∞<Z<0.44) = 0.6700

f) P(-∞<Z<1.2) = 0.8849

g) P(Z<1) - P(Z<-1) = 0.8413 - [1 - P(Z<1)] = 0.8413 - 0.1587 = 0.6826

h) P(Z<1.96) - P(Z<-1.96) = 0.975 - [1 - P(Z<1.96)] = 0.975 - 0.025 = 0.9500

i) P(Z<2.58) - P(Z<-2.58) = 0.9951 - 0.0049 = 0.9902
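The table look-ups in 1.2 can be cross-checked without a printed table. A minimal sketch, assuming only the identity Φ(z) = 0.5[1 + erf(z/√2)] for the standard normal distribution function; the helper name `phi` is hypothetical.

```python
import math

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

answers = {
    "a": phi(1.4) - phi(0.0),
    "b": phi(0.0) - phi(-0.78),
    "c": phi(1.9) - phi(-0.24),
    "d": phi(1.96) - phi(0.75),
    "e": phi(0.44),
    "f": phi(1.2),
    "g": phi(1.0) - phi(-1.0),
    "h": phi(1.96) - phi(-1.96),
    "i": phi(2.58) - phi(-2.58),
}
for key, value in answers.items():
    print(key, round(value, 4))
```

Small differences in the fourth decimal are expected, since the printed answers rely on a four-digit table.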

1.3 

a)  By  Table  B,  Z1  lies  between  1.40  and  1.41  so  that  by  linear  interpolation  we 
get  Z1=1.405. 

b) Z = -1.3733.

1.4 

Z = (X - μ)/σ = (26 - 20)/√9 = 2; P(Z < 2) = 0.97725

=> P(Z > 2) = 1 - P(Z < 2) = 1 - 0.97725 = 0.02275

Hence, 2.28% of the population may be expected to have values above 26.


1.5

a) 0.88; b) 0.128; c) 0.829

1.6

a) 0.4649; b) 0.2684; c) 0.0401

1.7

a) 435; b) 92

1.8

a) 90.5%; b) 0.0588

1.9

a) n=13; b) n=4; c) n=4; d) 0.2266


1.10

σX̄ = σ/√n = 7 × 10^-5/√7 = 2.646 × 10^-5;

Z1-α/2 = Z0.975 = 1.96; X̄ = 12.36 × 10^-5;

12.36 × 10^-5 - 1.96 × 2.646 × 10^-5 < μ < 12.36 × 10^-5 + 1.96 × 2.646 × 10^-5

7.17 × 10^-5 < μ < 17.55 × 10^-5
1.11

X̄ = 65.5 [BTU/HR FT2 °F]; Sx = 4.347 [BTU/HR FT2 °F]; 1 - α = 0.99;

α = 0.01; α/2 = 0.005; f = 10-1 = 9; t9;0.995 = 3.250
65.5 - 3.25 × 4.347/√10 < μ < 65.5 + 3.25 × 4.347/√10
61.03 < μ < 69.97

Minimal value with 99% confidence is: 61.03 [BTU/HR FT2 °F]
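The intervals in 1.10 and 1.11 share one pattern: mean plus or minus quantile times standard deviation over √n. A minimal sketch of that pattern follows; the function name is hypothetical, and the quantiles 1.96 and 3.25 are simply the table values quoted in the answers.

```python
import math

def mean_confidence_interval(mean, std, n, quantile):
    """Two-sided interval: mean -/+ quantile * std / sqrt(n)."""
    half_width = quantile * std / math.sqrt(n)
    return mean - half_width, mean + half_width

# 1.10: known sigma, normal quantile
lo_110, hi_110 = mean_confidence_interval(12.36e-5, 7e-5, 7, 1.96)
# 1.11: estimated standard deviation, Student quantile with f = 9
lo_111, hi_111 = mean_confidence_interval(65.5, 4.347, 10, 3.25)

print(round(lo_110 * 1e5, 2), round(hi_110 * 1e5, 2))
print(round(lo_111, 2), round(hi_111, 2))
```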


1.12 


X̄1 = 15.770; X̄2 = 15.597; Z1-α/2 = Z0.975 = 1.96; σ1² = σ2² = 0.016;

X̄1 - X̄2 + Zα/2 × (σ1²/n1 + σ2²/n2)^0.5 < μ1 - μ2 < X̄1 - X̄2 + Z1-α/2 × (σ1²/n1 + σ2²/n2)^0.5

15.770 - 15.597 - 1.96 × (0.016/3 + 0.016/3)^0.5 < μ1 - μ2 <
15.770 - 15.597 + 1.96 × (0.016/3 + 0.016/3)^0.5

-0.029 < μ1 - μ2 < 0.375


1.13 

Replicated measurements will be treated as individual observations:

X̄1 = 5.37; X̄2 = 5.67; S1² = 0.08425; S2² = 0.07325.

Assume that σ1 = σ2.

Sp = {[(n1 - 1)S1² + (n2 - 1)S2²]/(n1 + n2 - 2)}^0.5
   = [(2 × 0.08425 + 2 × 0.07325)/(3 + 3 - 2)]^0.5 = 0.28062

Since 1-α = 0.95 and n1 = 3; n2 = 3, or f = n1+n2-2 = 4, we get from the table for
the t-distribution that t4;0.975 = 2.776.

5.37 - 5.67 - 2.776 × 0.28062 × (1/3 + 1/3)^0.5 < μ1 - μ2 <
5.37 - 5.67 + 2.776 × 0.28062 × (1/3 + 1/3)^0.5
-0.9361 < μ1 - μ2 < 0.3361

Since the 95% confidence interval contains 0 = μ1 - μ2, which means that μ1 = μ2
cannot be ruled out, the catalysts do not differ in efficiency.
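The pooled-variance interval of 1.13 can be retraced as below. This is a sketch under one stated assumption: the half-width includes the factor sqrt(1/n1 + 1/n2), which is what reproduces the lower limit of -0.9361 quoted in the answer. The function names are hypothetical.

```python
import math

def pooled_std(s1_sq, n1, s2_sq, n2):
    """Pooled standard deviation of two samples with equal population variance."""
    return math.sqrt(((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2))

def mean_difference_interval(x1, x2, sp, n1, n2, t_quantile):
    """Confidence interval for mu1 - mu2 based on the pooled estimate sp."""
    half_width = t_quantile * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = x1 - x2
    return diff - half_width, diff + half_width

sp = pooled_std(0.08425, 3, 0.07325, 3)
lower, upper = mean_difference_interval(5.37, 5.67, sp, 3, 3, 2.776)
print(round(sp, 5), round(lower, 4), round(upper, 4))
```

Because the interval straddles zero, the conclusion of the answer (no detectable difference between the catalysts) follows directly.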


1.14 

X̄ = 5.88; Sx² = 0.00975; n = 5; k = n-1 = 4; α = 0.05;

χ²4;0.025 = 0.484; χ²4;0.975 = 11.1

4 × 0.00975/11.1 < σ² < 4 × 0.00975/0.484

0.00351 < σ² < 0.08057
1.15

X̄1 = 310.8423 [°C]; S1² = 1.1867; k1 = n1-1 = 12;
X̄2 = 310.5246 [°C]; S2² = 1.5757; k2 = n2-1 = 12;
α = 0.02; F12;12;0.99 = 4.16; F12;12;0.01 = 0.241.

(1.1867/1.5757) × 0.241 < σ1²/σ2² < (1.1867/1.5757) × 4.16

0.18104 < σ1²/σ2² < 3.13300

The associated 98% confidence interval for σ1/σ2 is:

0.42548 < σ1/σ2 < 1.7700

Since the 98% confidence interval includes the value 1 = σ1/σ2, one may not state
that there is a statistically significant difference between σ1² and σ2².


1.16 

X̄ = 2.50 × 10^11; S² = 1.11 × 10^20; n = 10; α = 0.10; t9;0.05 = -1.83; t9;0.95 = 1.83.

X̄ + tα/2 × S/√n < μ < X̄ + t1-α/2 × S/√n

2.50 × 10^11 - 1.83 × 1.05 × 10^10/√n < μ < 2.50 × 10^11 + 1.83 × 1.05 × 10^10/√n
2.446 × 10^11 < μ < 2.556 × 10^11


1.17 

X̄ = 11.411; Sx = 3.5551; Z0.025 = -1.96; 10.24 < μ < 12.58 [mg/m3]


1.18 


ΣX = 25.15; ΣX² = 22.0680; X̄ = ΣX/n = 0.838;

Sx² = [nΣX² - (ΣX)²]/[n(n-1)] = [30 × 22.0680 - 25.15²]/[30(30-1)] = 0.03393;

Sx = 0.18420; f = n-1 = 29; α = 0.01; t29;0.005 = -2.76; t29;0.995 = 2.76

X̄ + tα/2 × Sx/√n < μ < X̄ + t1-α/2 × Sx/√n

0.838 - 2.76 × 0.18420/√30 < μ < 0.838 + 2.76 × 0.18420/√30
0.745 < μ < 0.931 [%Si]
1.19

X̄1 = 8.57; S1² = 17.95; n1 = 7; X̄2 = 11.89; S2² = 15.61; n2 = 9;

f = 14.46; T = -1.601; S(X̄1-X̄2) = 2.073.

1.20 

1. Assume normal distribution

2. H0: μ ≥ 26.5; H1: μ < 26.5

3. Since σ² is unknown we use the next test statistic:

   T = (X̄ - μ0)/(S/√n)

4. Let α = 0.05;

5. T has Student's distribution with f = n-1 = 11-1 = 10 degrees of freedom.

6. X̄ = 26.03568; S/√11 = 6.741 × 10^-5

   T = (26.03568 - 26.5)/(6.741 × 10^-5) = -(0.46432/6.741) × 10^5 = -6.888 × 10^3

7. t10;0.95 = 1.81 from Table C. Hence:

T = -6.888 × 10^3 < -t10;0.95 = -1.81, so that H0 is rejected, or we may say that μ
does not exceed 26.5. The chance of making an error in this statement is 5 in 100,
or α = 0.05.


1.21 


X̄ = 54.76; S² = 4.216; S = 2.053; α = 0.01; f = n-1 = 49

T = (54.76 - 55.0)/(2.053/√50) = -0.827; t49;0.01 = -2.704

H0 with 99% confidence is not rejected.


1.22 

1. Assume a normal distribution

2. H0: μ1 = μ2; H1: μ1 ≠ μ2

3. Since we know the population variance σ1² = σ2² = 0.016, the test statistic is:

   Z = [X̄1 - X̄2 - (μ1 - μ2)]/[σ × (1/n1 + 1/n2)^0.5]
     = [15.770 - 15.597 - (μ1 - μ2)]/[√0.016 × (1/3 + 1/3)^0.5] = 1.675

4. α = 0.05

5. Z has normal distribution.

6. Z = 1.675

7. Z1-α/2 = 1.96; Zα/2 = -1.96; since -1.96 < 1.675 < 1.96, the null hypothesis μ1 = μ2
is accepted, or both bottles contain the same HCl.
1.23

X̄1 = 14.280; S1² = 0.028; n1 = 4; X̄2 = 14.374; S2² = 0.004; n2 = 3

T = (X̄2 - X̄1)/(S1²/n1 + S2²/n2)^0.5, with X̄2 - X̄1 = 14.374 - 14.280; T = 1.0330;

α = 0.05; t5;0.05 = 2.01; T = 1.0330 < t5;0.05 = 2.01

The means of burning rates are not statistically different with 95% confidence.

1.24

S1² = 0.028; n1 = 4; S2² = 0.004; n2 = 3; k1 = 3; k2 = 2

FR = S1²/S2² = 0.028/0.004 = 7.00;

F0.95(3;2) = 19.2; F0.05(3;2) = 1/19.2 = 0.05

0.05 < FR = 7.00 < 19.2

We accept H0 with 95% confidence: there are no statistically significant differences
between the variances of these two samples.

1.25

X̄1 = 14.280; S1² = 0.0280; n1 = 4; X̄2 = 15.681; S2² = 0.0024; n2 = 4

T = (X̄2 - X̄1)/(S1²/n1 + S2²/n2)^0.5, with X̄2 - X̄1 = 15.681 - 14.280; T = 16.0665;

f = 4; α = 0.05; t4;0.05 = 2.13; T = 16.0665 > t4;0.05 = 2.13

It is asserted with 95% confidence that there are statistically significant
differences in the effects of the two catalyst types on burning rate.

1.26 


Source of variations    f     SS       MS       F        F1;8;0.95
Factor A                1     0.146    0.146    5.84     5.32
Factor B                1     0.002    0.002    0.08     5.32
Factor C                1     0.300    0.300    12.00    5.32
Interaction AB          1     0.000    0.000    0.00     5.32
Interaction AC          1     0.104    0.104    4.16     5.32
Interaction BC          1     0.253    0.253    10.12    5.32
Interaction ABC         1     0.041    0.041    1.64     5.32
Error                   8     0.200    0.025    -        -
Total                   15    1.046    -        -        -

1.27 

a) Sx² = 24.596; Sp² = 1.808; F = 13.6 > F3;20;0.95 = 3.10

b) t20;0.95 = 2.086; 36.522 < μ < 38.812

1.28

Sx² = 5.1566; Sp² = 0.89391; F = 5.77 > F3;36;0.95 = 2.85


1.29

Sx² = 0.0014167; Sp² = 0.0007458; F = 1.899 < F3;12;0.99 = 5.95
Sp = 0.01362; t12;0.995 = 3.055; μ = 0.385 ± 3.055 × 0.01362

1.30

SSC = 5,653,437.5; SSR = 170,343,939.6; SSE = 7,306,879.2;

SST = 310,933,275; pressure does not affect the moment, but the actuator model does.
1.31

Source of variation    f     SS         MS         F            F1;12;0.95
Factor methods         1     0.00001    0.00001    34.61335     4.75
Factor mixing          1     0.00003    0.00003    112.23636    4.75
Interaction            1     0.00000    0.00000    0.19043      4.75
Error                  12    0.00000    0.00000    -            -
Total                  15    0.00004    -          -            -

1.32 

Source of variation      f     SS        MS       F        F1;24;0.95
Cutting tools (T)        1     2.82      2.82     1.21     4.26
Cutting angle (B)        1     20.32     20.32    8.72     4.26
Type of cutting (C)      1     31.01     31.01    13.31    4.26
Interaction T × B        1     0.20      0.20     0.09     4.26
Interaction T × C        1     0.01      0.01     0.004    4.26
Interaction B × C        1     0.94      0.94     0.40     4.26
Interaction T × B × C    1     0.19      0.19     0.08     4.26
Error                    24    53.44     2.33     -        -
Total                    31    108.93    -        -        -

1.33 

Source of variation           f     SS         MS        F        Ftab
Samples                       11    93.486     8.499     7.691    F11;22;0.95=2.26
Kivets                        2     14.527     7.263     6.573    F2;22;0.95=3.44
Bottles                       1     0.347      0.347     0.314    F1;22;0.95=4.30
Samples × kivets              22    22.806     1.037     0.938    F22;22;0.95=2.05
Samples × bottles             11    27.153     2.468     2.233    F11;22;0.95=2.26
Kivets × bottles              2     1.695      0.847     0.767    F2;22;0.95=3.44
Samples × kivets × bottles    22    24.305     1.105*    -        -
Total                         71    184.319    -         -        -

* Three-factor interaction variance is used for estimation of residual variance


1.34 

The experimental program has been done with no trial replication, so that the residual
variance must be determined from the interaction variances. Whether all interactions
may be neglected, or whether some interaction variance must be kept out of the
residual variance because its effect is significant and differs from the others, is
checked against Bartlett's criterion. Assuming that the interaction variances are
negligible, we have:

Source of variation    f    SS        MS        F        F1;4;0.95
Factor A               1    3.125     3.125     1.87     7.71
Factor B               1    47.045    47.045    28.17    7.71
Factor C               1    0.720     0.720     0.43     7.71
Interaction AB         1    0.720     0.720     -        -
Interaction AC         1    0.045     0.045     -        -
Interaction BC         1    0.405     0.405     -        -
Interaction ABC        1    0.500     0.500     -        -
Total                  7    52.560    -         -        -

Pooled interactions (residual): 1.670

1.35 

Source of variation              f     SS        MS       F         Ftab
Between batches                  8     792.88    99.11    132.15    F8;27;0.95=2.31
Between samples                  2     4.21      2.11     2.81      F2;27;0.95=3.35
Interaction batches × samples    16    21.09     1.32     1.76      F16;27;0.95=2.02
Error                            27    20.17     0.75     -         -
Total                            53    838.35    -        -         -

1.36 

Source of variation                f     SS          MS          F           Ftab
Between chalks                     1     0.503040    0.503040    62880.00    F1;44;0.95=4.04
Between laboratories               10    0.005223    0.000522    65.25       F10;44;0.95=2.03
Interaction laboratory × chalks    10    0.000478    0.000048    6.00        F10;44;0.95=2.03
Error                              44    0.000363    0.000008    -           -
Total                              65    0.509104    -           -           -

1.37 

Source of variation    f     SS         MS            F         Ftab
Factor A               4     478 463    119 615.75    373.53    F4;24;0.95=2.78
Factor B               2     52 794     26 397.00     82.41     F2;24;0.95=3.40
Factor C               3     150 239    50 079.67     156.34    F3;24;0.95=3.01
Interaction AB         8     16 807     2 100.88      6.56      F8;24;0.95=2.36
Interaction AC         12    53 890     4 490.83      14.02     F12;24;0.95=2.18
Interaction BC         6     6 416      1 069.33      3.34      F6;24;0.95=2.51
Interaction ABC        24    7 688      320.33        -         -
Total                  59    766 297    -             -         -

Since the experiments have been done with no replications, the residual variance has
been determined from the variance of interaction ABC, as it has been proved by
theoretical analysis that it may be neglected.
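The table of 1.37 can be rebuilt from the sums of squares alone. The sketch below is an illustration under stated assumptions: the numbers of levels (5, 3 and 4 for A, B and C) are inferred from the degrees of freedom in the table, and the ABC interaction serves as the residual, as the answer explains. All names are hypothetical.

```python
levels = {"A": 5, "B": 3, "C": 4}   # assumed from f = 4, 2, 3 in the table
ss = {"A": 478463, "B": 52794, "C": 150239,
      "AB": 16807, "AC": 53890, "BC": 6416, "ABC": 7688}

def degrees_of_freedom(term):
    """Product of (levels - 1) over the factors named in the term."""
    df = 1
    for factor in term:
        df *= levels[factor] - 1
    return df

df = {term: degrees_of_freedom(term) for term in ss}
ms = {term: ss[term] / df[term] for term in ss}
residual_ms = ms["ABC"]             # no replication: ABC estimates the error
f_ratio = {term: ms[term] / residual_ms for term in ss if term != "ABC"}

for term, value in f_ratio.items():
    print(f"{term:3s} f={df[term]:2d} MS={ms[term]:10.2f} F={value:7.2f}")
```

Each F-ratio is then compared with the tabular value for (f of the term; 24) degrees of freedom listed in the last column.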
1.38

Source of variation              f      SS        MS       F        Ftab
Between locations                3      128.74    42.91    13.41    F3;105;0.95=2.70
Between alloys                   8      128.18    16.02    5.01     F8;105;0.95=2.04
Between researchers              3      106.24    35.41    11.07    F3;105;0.95=2.70
Location × alloy                 24     89.32     3.72     1.16     F24;105;0.95=1.65
Location × researcher            9      12.73     1.41     -        -
Researcher × alloy               24     23.32     0.97     -        -
Location × alloy × researcher    72     58.96     0.82     -        -
Total                            143    547.49    -        -        -

Pooled last three interactions (residual): 3.20

Since the three-factor analysis of variance has no replications, the variances of the
last three interactions may be pooled to estimate the residual variance, as they
cannot be significant theoretically.

1.39 

Fflow = 165.707; Fconc = 7.607; MSE = 1.90276 (pooled AB error).

At 95% confidence, catalyst flow and concentration have a statistically significant
effect on the conversion of acetic acid.


1.40

SSC = 39,934.1875; SSR = 324,082.1875; SSE = 9,232.0625

MSC = 13,311.3958; MSR = 108,027.3958; MSE = 1025.7847;
FC = 12.97679 > F3;9;0.99 = 6.99; FR = 105.31196 > F3;9;0.99 = 6.99;

Interaction AB is pooled into the residual variance together with the error variance;
it is not extracted, as there have been no trial replications.

1.41

SSADH = 6274.307; SSPRA = 4.380; the effect of a preliminary surface preparation is
insignificant, while the effect of adhesion systems is significant.

FADH = 150.422; FPRA = 0.315; FADH×PRA = 19.304;

F1;40;0.95 = 4.08; F3;40;0.95 = 2.84.

1.42

SSPRA = 9.275; SSADH = 13,761.231; SSPRA×ADH = 266.21.

All effects except surface preparation with a primer are significant with 95%
confidence.

FPRA = 0.83442; FADH = 412.66618; FPRA×ADH = 7.98297;

F1;40;0.95 = 4.08; F3;40;0.95 = 2.84.
1.43

Source of variation      f     SS           MS          F         Ftab
Adhesive x (A)           3     17 721.30    5 907.10    472.20    F3;80;0.95=2.70
Preparation x (B)        1     13.20        13.20       1.06      F1;80;0.95=3.96
Thickness x (C)          1     4 256.01     4 256.01    340.22    F1;80;0.95=3.96
Interaction A × B        3     951.79       317.26      25.36     F3;80;0.95=2.70
Interaction A × C        3     2 314.24     771.41      61.67     F3;80;0.95=2.70
Interaction B × C        1     0.45         0.45        0.04      F1;80;0.95=3.96
Interaction A × B × C    3     119.64       39.88       3.19      F3;80;0.95=2.70
Error                    80    1 000.78     12.51       -         -
Total                    95    26 377.41    -           -         -

1.44 

Source of variation    f     SS       MS       F          F1;8;0.95
Type of catalyst       1     5.752    5.752    410.857    5.32
Batch of AP            1     0.024    0.024    1.714      5.32
Interaction            1     0.023    0.023    1.643      5.32
Error                  8     0.110    0.014    -          -
Total                  11    5.909    -        -          -

1.45 

Xeks = 5.2; X̄ = 4.39; Sx² = 0.0481; Sx = 0.2193; N = 13; f = 12;

ΔXmax = 5.2 - 4.39 = 0.81;

fo  = /: 

ii 

H-O 

II 

1 © 
So 
+ II 

1.80; 

~\  0.5 


13C2  xll 


12 


= 1.80;  C « 1.59 


ΔXmax = C × Sx = 1.59 × 0.2193 = 0.35


ΔXmax = 0.81 > 0.35; X = 5.2 is an outlier
1.48

Xeks = 14.531; X̄ = 14.196; Sx² = 9.350 × 10^-6; Sx = 0.003; N = 4; f = N - 1 = 3;
ΔXmax = Xmax - X̄ = 14.531 - 14.196 = 0.335; f0 = 0; t0.05 = 2.92;


ΔXmax = C × Sx = 1.35 × 0.003 = 0.004
ΔXmax = 0.335 > 0.004; X = 14.531 is an outlier


1.49 

a) b0 = 9.27; b1 = 1.44; Ŷ = 9.27 + 1.44X

b)

Source of variation    f     SS        MS        F        F1;9;0.95
Regression             1     226.95    226.95    96.17    5.12
Residual               9     21.23     2.36      -        -
Corrected total        10    248.18    -         -        -

c) 1.11 < β1 < 1.77

d) 12.15 < μ < 15.03

e) 12.15 < μ < 15.03

1.50 

II  II 

6.61 < μ < 23.89 (X = -2)

Source of variation    f     SS          MS          F        F5;6;0.95
Regression             1     3 293.77    3 293.77    -        -
Residual               11    102.85      9.35        -        -
Lack of fit            5     91.08       18.22       9.30*    4.39
Pure error             6     11.77       1.96        -        -
Total                  12    3 396.62    -           -        -

Ŷ = 129.7872 - 24.0199X
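The lack-of-fit line of such a table follows from the residual and pure-error lines. A minimal sketch, with a hypothetical function name, using the sums of squares and degrees of freedom quoted for 1.50:

```python
def lack_of_fit_f(ss_residual, df_residual, ss_pure_error, df_pure_error):
    """Split the residual into lack of fit and pure error; return SS, f and F."""
    ss_lof = ss_residual - ss_pure_error
    df_lof = df_residual - df_pure_error
    ms_lof = ss_lof / df_lof
    ms_pure_error = ss_pure_error / df_pure_error
    return ss_lof, df_lof, ms_lof / ms_pure_error

ss_lof, df_lof, f_value = lack_of_fit_f(102.85, 11, 11.77, 6)
print(round(ss_lof, 2), df_lof, round(f_value, 2))

# F above the tabular F5;6;0.95 = 4.39 signals lack of fit of the linear model
model_is_adequate = f_value < 4.39
print(model_is_adequate)
```

The ratio comes out slightly below the printed 9.30 because the table divides the already rounded mean squares.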

1.51 

X̄ = 6668.4; Ȳ = 8662.6; Ŷ = 1360.0 + 1.0951X

1.52

X̄ = 751.875; Ȳ = 316.75; Ŷ = 654.9846 - 0.4498X

1.53

X̄ = 2165.000; Ȳ = 2125.278; Ŷ = 15.352 + 0.974561X

Source of variation    f     SS             MS             F           F1;16;0.95
Regression             1     4 772 451.7    4 772 451.7    1 501.01    4.49
Residual               16    50 871.9       3 179.49       -           -
Total                  17    4 823 323.6    -              -           -

1.54 

Source of variation    f     SS       MS       F        F4;6;0.95
Regression             1     52.50    52.50    -        -
Residual               10    17.17    1.717    -        -
Lack of fit            4     5.50     1.375    0.706    4.53
Pure error             6     11.67    1.945    -        -
Total                  11    69.67    -        -        -

a) Ŷ = -21.33 + 5.0X    b) 2.984 < β1 < 7.016


1.55 

b0 = 0.353332; b1 = 3.34292 × 10^-4

Ŷ = 0.353332 + 3.34292 × 10^-4 × t

1.56

Ŷ = 45.1972 - 2.68408X1 + 4.20910X2; r² = 0.94808


1.57

a) X = 14.7075 - 1.2042Z1 - 0.46288Z2; b) r² = 0.82523

1.58

Ŷ1 = 3.09204 + 0.193694X + 0.0362872X²; r² = 0.99956;

Ŷ2 = 2.41588 + 0.180067X + 0.034122X²; r² = 0.998707;

1.59

Ŷ = 3.73926 + 1.7717X - 0.0601562X²; r² = 0.991516
1.60

a) Ŷ = 2.5372000 - 0.0004718X    b) Regression model is adequate

Source of variation    f     SS          MS          F        Ftab
Regression             1     0.110395    0.110395    14.50    F1;13;0.95=4.67
Residual               13    0.098938    0.007611    -        -
Lack of fit            5     0.018938    0.003788    0.38     F5;8;0.95=3.69
Pure error             8     0.080000    0.010000    -        -
Total                  14    0.209333    -           -        -

c) 95 % confidence interval of Y for:

X = 0; X = X̄; X = 400; X = 460.

1. X = 0: μ ± 2.160 × 0.527 = μ ± 1.138

2. X = X̄: μ ± 2.160 × 0.022 = μ ± 0.048

3. X = 400: μ ± 2.160 × 0.039 = μ ± 0.084

4. X = 460: μ ± 2.160 × 0.048 = μ ± 0.104.
2.1

Based on Table 2.14:

S = 1650; ΣTj = 156 (j = 1, ..., 4); T = 26;

ω = 12 × 1650/[16(1728 - 12) - 4 × 156] = 0.738

From Table D for α = 0.05 and f = 12-1 = 11 we get χ² = 19.75. The null hypothesis on
the accord of the researchers' opinions is accepted. The histogram of rank sums is
given in the figure.

[Figure: Histogram of ranks; horizontal axis: factors X1 to X12]

It is evident from the diagram that the curve of ranks is even and nonmonotonous,
so the suggestion is to take the eight most significant factors as the basis
of the next phase of experimental studies.

2.2 

ω = 0.526


2.3 

ω = 12 × 14665/[18²(11³ - 11) - 18 × 946] = 0.43;

χ²R = 18(11 - 1) × 0.43 = 77.4;

χ²T(0.01;10) = 23.2

2.4 

ω = 0.012; χ²R = 4.9.

2.5 

Screened factors are X1, X2, X3, X4, X5 and X6.

The regression has this form:

Ŷ = 67.4 + 10.7X1 + 3.9X2 - 7.0X3 + 3.45X4 + 1.2X5 + 3.8X6

Response values estimated by the regression are given in Table 2.40.


2.6 


-3.05; 

% 

EXlx2 

= -13.30; 

EXy 

eX2X} 

= -5.02; 

E*s 

Ex3x4 

= -2.75; 

Ex, 

EXlx, 

= 2.00; 

64.90; 

-0.97; 

4.72; 

-1.25; 

1.00; 


Finally,  we  screened  out  factors 
with  two  levels. 


first  screening 

Ex&  =3  .12;  second  screening; 

EX}  =4  .25;  third  screening; 

fourth  screening; 
fifth  screening. 

X3;X2  and  X3.  Factors  X4  and  X8  are  qualitative 


2.7 

SSC = (275625.0 + 240100.0 + 334084.0 + 295936.0)/(3 × 3) - 4566769.0/(3 × 3 × 4)

SSC = 127305.00 - 126854.69 = 450.31

SSR = (355216.0 + 528529.0 + 662596.0)/(4 × 3) - 126854.69

SSR = 128861.75 - 126854.69 = 2007.06

SSCR = (23104 + 18496 + 25281 + 22201 + 31684 + 27225 + 37636 + 36100 +
       + 38025 + 35721 + 50625 + 42025)/3 - 127305.0 - 128861.75 + 126854.69
SSCR = 62.27; SSE = 129995.00 - 126854.69 = 3140.31

Source of variation          f     SS          MS          F        Ftab
Between blocks               3     450.31      150.10      14.46    F3;6;0.95=4.76
Between levels of factors    2     2 007.06    1 003.53    96.68    F2;6;0.95=5.14
Experimental error           6     62.27       10.38       -        -
Sampling error               24    3 140.31    130.85      -        -
Total                        35    5 659.95    -           -        -
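The first three sums of squares of 2.7 can be recomputed from cell totals. This is a sketch under assumptions: the cell totals below are the square roots of the twelve squared terms in the SSCR line, arranged as 3 factor levels (rows) by 4 blocks (columns) with 3 samples per cell, which is the arrangement that reproduces the squared row and column totals in the SSR and SSC lines. Variable names are hypothetical.

```python
cell_totals = [                # assumed layout: rows = factor levels, columns = blocks
    [152, 136, 159, 149],
    [178, 165, 194, 190],
    [195, 189, 225, 205],
]
samples_per_cell = 3
n_rows, n_cols = len(cell_totals), len(cell_totals[0])

grand_total = sum(sum(row) for row in cell_totals)
correction = grand_total ** 2 / (n_rows * n_cols * samples_per_cell)

block_totals = [sum(row[j] for row in cell_totals) for j in range(n_cols)]
level_totals = [sum(row) for row in cell_totals]

ss_blocks = sum(t ** 2 for t in block_totals) / (n_rows * samples_per_cell) - correction
ss_levels = sum(t ** 2 for t in level_totals) / (n_cols * samples_per_cell) - correction
ss_cells = sum(t ** 2 for row in cell_totals for t in row) / samples_per_cell - correction
ss_experimental_error = ss_cells - ss_blocks - ss_levels   # block x level interaction

print(round(ss_blocks, 2), round(ss_levels, 2), round(ss_experimental_error, 2))
```

The interaction term differs from the printed 62.27 in the last digit only because the answer carries the correction term rounded to two decimals.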

2.8 

SSC = (30625.00 + 26676.69 + 37121.73 + 32880.57)/3 - (507414.03)/(3 × 4)

SSC = 42434.66 - 42284.50 = 150.16

SSR = (39469.77 + 58723.83 + 73619.97)/4 - 42284.50

SSR = 668.89

SSE = 43124.30 - 42284.50 - 150.16 - 668.89 = 20.75

Source of variation          f     SS        MS        F        Ftab
Between blocks               3     150.16    50.05     14.47    F3;6;0.95=4.76
Between levels of factors    2     668.89    334.45    96.66    F2;6;0.95=5.14
Experimental error           6     20.75     3.46      -        -
Total                        11    839.80    -         -        -

2.9 

Source of variation    f     SS         MS        F        Ftab
Type of mixing         6     6 317      1 053     2.33     F6;30;0.90=1.98
Active matter          6     99 157     16 526    36.64    F6;30;0.99=3.47
Dissolvent type        6     5 487      915       2.02     F6;30;0.90=1.98
Experimental error     30    13 541     451       -        -
Total                  48    124 502    -         -        -

2.10 

The missing datum has been estimated as 90. Analysis of variance gave these results:

Source of variation     f      SS        MS                 F        Ftab
Nozzles in operation    3      379.19    126.397            2.250    -
Air flow                3      86.19     28.73              0.511    -
Water pressure          3      101.19    33.73 (32.17)**    0.573    -
Experimental error      5*     280.87    56.174             -        -
Total                   15*    847.44    -                  -        -

* One degree of freedom less because one datum has been estimated

** Corrected because of the estimation

No factor is significant for the efficiency of the scrubber, which may be explained
either by the real factors not having been included in the research or by the
variation intervals of the analyzed factors being too small.

2.11

Source of variation      f     SS         MS       F        F4;12;0.95
Stators                  4     1 302.8    325.7    27.07    3.26
Rotors                   4     36.4       9.10     0.76     3.26
Quality of insulation    4     52.4       13.10    1.09     3.26
Experimental error       12    144.4      12.03    -        -
Total                    24    1 536.0    -        -        -

2.12 

SSR = (22² + 18² + 21² + 19²)/4 - 80²/16 = 2.50

SSC = (18² + 17² + 26² + 19²)/4 - 80²/16 = 12.50

SSL = (26² + 21² + 17² + 16²)/4 - 80²/16 = 15.5

SSG = (23² + 20² + 17² + 20²)/4 - 80²/16 = 4.5

SSE = 446 - 400 - 2.50 - 12.50 - 15.5 - 4.5 = 11.00

SST = 446 - 400 = 46.00

Source of variation      f     SS       MS      F       F3;3;0.95
Between rows             3     2.50     0.83    0.23    9.28
Between columns          3     12.50    4.17    1.14    9.28
Between latin letters    3     15.50    5.17    1.41    9.28
Between numbers          3     4.50     1.50    0.41    9.28
Experimental error       3     11.00    3.67    -       -
Total                    15    46.00    -       -       -
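The sums of squares of 2.12 can be recomputed from the four sets of totals. The sketch below assumes a 4 × 4 Graeco-Latin square with grand total 80 and a raw sum of squares of 446, as in the answer, and reads the first row total as 22 and the first total for numbers as 23, the values for which every set of totals adds up to 80. Names are hypothetical.

```python
def ss_from_totals(totals, grand_total, n_obs):
    """Sum of squares for one classification from its group totals."""
    per_group = n_obs // len(totals)
    return sum(t ** 2 for t in totals) / per_group - grand_total ** 2 / n_obs

grand_total, n_obs, raw_sum_of_squares = 80, 16, 446
totals = {
    "rows": [22, 18, 21, 19],
    "columns": [18, 17, 26, 19],
    "latin letters": [26, 21, 17, 16],
    "numbers": [23, 20, 17, 20],
}

ss = {name: ss_from_totals(t, grand_total, n_obs) for name, t in totals.items()}
ss_total = raw_sum_of_squares - grand_total ** 2 / n_obs
ss_error = ss_total - sum(ss.values())
ms_error = ss_error / 3                      # f = 15 - 4 * 3 = 3
f_ratio = {name: (value / 3) / ms_error for name, value in ss.items()}

print(ss, ss_total, ss_error)
print({name: round(value, 2) for name, value in f_ratio.items()})
```

None of the F-ratios approaches the tabular value 9.28, which matches the table.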

2.13 

Source of variation    f     SS           MS          F        F4;8;0.95
Temperature            4     15 100.56    3 775.14    26.29    21.4
Time                   4     3 195.36     798.84      5.56     21.4
Reactor                4     200.76       50.19       0.35     21.4
Operator               4     254.16       63.54       0.44     21.4
Experimental error     8     1 148.92     143.62      -        -
Total                  24    19 899.76    -           -        -



2.14 


Source of variation    f     SS        MS       F        Ftab
Cycles                 -     288.92    -        -        -
Corrected-cycles       3     147.25    49.08    45.44    F3;3;0.95=9.28
Corrected-dyestuff     3     164.58    54.86    50.80    F3;3;0.95=9.28
Positions              2     8.17      4.09     3.79     F2;3;0.95=9.55
Residual               3     3.25      1.08     -        -
Total                  11    464.92    -        -        -

2.15 


Source of variation     f     SS      MS      F      F_tab
Corrected conditions    6     0.57    0.10    1.25   F_{6;12;0.95} = 3.0
Corrected producer      6     1.06    0.18    2.25   F_{6;12;0.95} = 3.0
Producer                -     0.99    -       -      -
Type of saw             3     0.04    0.01    0.13   F_{3;12;0.95} = 3.49
Residual               12     0.96    0.08    -      -
Total                  27     2.56    -       -      -

$SS_1 = (7.50^2 + 8.14^2 + 7.26^2 + 6.77^2 + 8.08^2 + 6.84^2 + 9.03^2)/4 - 53.62^2/(7 \times 4) = 0.99$

$SS_2 = \frac{1}{7 \times 4\,(4-1)} [(4 \times 7.52 - 7.50 - 8.14 - 7.26 - 8.08)^2 + ... + (4 \times 8.69 - 7.50 - 8.14 - 6.77 - 9.03)^2] = 0.57$

$SS_3 = (7.52^2 + 7.24^2 + 7.99^2 + 7.41^2 + 6.94^2 + 7.83^2 + 8.69^2)/4 - 102.68 = 0.50$

$SS_4 = \frac{1}{7 \times 4\,(4-1)} [(4 \times 7.50 - 7.52 - 7.41 - 7.83 - 8.69)^2 + ... + (4 \times 9.03 - 7.99 - 6.94 - 7.83 - 8.69)^2] = 1.06$

$SS_5 = [(2.30 + 1.64 + 1.92 + 2.14 + 1.65 + 1.64 + 1.86)^2 + ... + (1.62 + 1.93 + 2.22 + 1.58 + 1.61 + 1.74 + 2.67)^2]/7 - 102.68 = 0.04$

$SS_T = (1.62^2 + 2.10^2 + 1.50^2 + 2.30^2 + ... + 1.86^2 + 2.62^2) - 102.68 = 2.56$

$SS_E = 2.56 - 0.99 - 0.57 - 0.04 = 0.96$

2.16

$b_0 = 22.48 \times 10^{-3}$; $b_1 = -1.01 \times 10^{-3}$; $b_2 = 19.46 \times 10^{-3}$;

$b_3 = 1.45 \times 10^{-3}$; $b_4 = -1.05 \times 10^{-3}$; $b_{12} = 2.24 \times 10^{-3}$;

$b_{13} = -1.32 \times 10^{-3}$; $b_{14} = 1.80 \times 10^{-3}$.

2.17 

$y_a = 1.94 + 0.17X_1 - 0.39X_2 - 0.32X_3 + 0.10X_4 - 0.08X_1X_3 - 0.09X_2X_3 + 0.09X_2X_4 + 0.07X_3X_5$

$y_b = 0.54 + 0.12X_2 + 0.16X_3 - 0.09X_4 - 0.18X_5 + 0.07X_1X_3 - 0.06X_1X_4 - 0.06X_1X_5 + 0.095X_2X_3 - 0.08X_2X_4 - 0.06X_3X_4 - 0.11X_3X_5 + 0.05X_4X_5$

2.18 

$y = 47.1 + 2.78X_1 - 1.29X_2 + 3.23X_3 + 1.55X_4 + 1.35X_5 - 4.9X_6$

2.19 

$y = 4.5250 + 0.1312X_1 - 0.0187X_2 + 0.4812X_3 - 0.0937X_4 + 0.0625X_1X_2 - 0.0625X_1X_3 - 0.05X_2X_3$

2.20 


Regression coefficients

y      b0       b1       b2       b3       b4       b5       b6       b7       b12      b13      b14      b15      b16      b17      b27      b127
y1     59.1     -3.4     -7.4     0.0      -4.2     1.7      3.1      1.0      1.1      0.6      -1.9     -2.2     -1.2     -0.4     -0.4     -0.5
y2     8.11     -2.19    -3.77    1.07     -1.54    2.78     -0.25    -0.42    0.34     1.16     -0.55    -1.47    -0.87    0.82     -0.16    -0.34
y3     46.3     -8.1     -14.1    4.0      -6.0     10.1     -2.0     -1.8     0.6      4.5      -1.9     -5.1     -2.2     2.9      -0.5     -1.0
y4     115      -6       -11      -7       -6       10       -6       -2       -10      2        1        0        7        -5       -3       1
y5     31.2     -13.8    -22.5    8.6      -12.7    0        0.6      1.6      7.9      6.0      -4.7     2.2      -1.4     -0.8     0.3      -2.4
y6     7.9      -0.7     0.0      0.0      -0.4     0.2      0.1      0.7      -0.4     -0.1     0.1      0.2      0.3      -0.1     0.4      0.2
y7     109      2        -1       6        0.0      -2       0        5        -5       0        2        -4       -1       3        4        0
y8     9.73     -0.12    -0.08    -0.07    0.36     -0.27    0.07     0.48     -0.22    -0.04    -0.14    0.43     0.10     0.06     -0.07    0.56
y9     1.09     -0.17    0.25     -0.12    0.30     0.18     -0.19    0.13     -0.19    0.17     0.01     -0.14    0.12     -0.20    0.20     0.05
y10    16.4     5.9      -3.2     -2.8     0.0      2.3      -0.1     -0.8     0.7      -2.0     0.3      -0.7     -1.2     0.3      1.1      1.2
y11    1.096    0.016    0.001    -0.031   -0.001   0.002    0.002    0.002    0.001    -0.007   0.002    -0.001   0.001    0.003    0.002    -0.001

2.21 

$y = -4.2482 + 3.60625X_1 + 0.77565X_2 + 0.7814X_1^2 + 0.0588X_2^2 + 0.00072498X_1X_2$

2.22

$y = 0.2993 - 0.0829X_1 + 0.0738X_2 - 0.149X_3$

$b_0 - \bar{y}_0 = 0.2993 - (0.1224 + 0.1382 + 0.1204 + 0.0943 + 0.0698 + 0.1135)/6 = 0.1896$

$y = 0.10975 + 0.0816X_1 + 0.0753X_2 - 0.1483X_3 - 0.6043X_1X_2 + 0.0345X_1X_3 - 0.0229X_2X_3 + 0.0359X_1^2 + 0.0321X_2^2 + 0.1224X_3^2$
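The difference between the free term of the linear model and the mean of the center-point runs, used in solution 2.22, is easy to recompute. The sketch below assumes that the six values in parentheses are the replicated responses at the center of the design; the variable names are illustrative only.

```python
# Replicated responses at the center of the design (from solution 2.22).
center_runs = [0.1224, 0.1382, 0.1204, 0.0943, 0.0698, 0.1135]
b0_linear = 0.2993  # free term of the first-order model

center_mean = sum(center_runs) / len(center_runs)

# A difference that is large relative to the experimental error suggests
# that quadratic terms are needed in the model.
difference = b0_linear - center_mean

print(f"mean of center runs = {center_mean:.5f}")
print(f"b0 - mean           = {difference:.4f}")
```

The center mean agrees, up to rounding, with the free term 0.10975 of the second-order model in the same solution.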

2.23 

$y = 353.00 - 145.07X_1 - 192.77X_2 + 133.70X_3 + 6.00X_1X_2 - 41.13X_1X_3 - 52.25X_2X_3 + 41.76X_1^2 + 51.35X_2^2 - 2.65X_3^2$

2.24 

$b_0 = 0.22006$; $b_1 = -0.0195$; $b_2 = 0.002169$; $b_3 = 0.025928$;

$b_4 = -0.10837$; $b_5 = -0.00446$; $b_{11} = 0.04943$; $b_{12} = -0.00375$;

$b_{13} = -0.056249$; $b_{14} = 0.3387$; $b_{15} = -0.04199$; $b_{22} = 0.03688$;

$b_{23} = 0.00399$; $b_{24} = 0.008625$; $b_{25} = -0.010499$; $b_{33} = 0.0296$;

$b_{34} = -0.0165$; $b_{35} = 0.071625$; $b_{44} = 0.0340$; $b_{45} = 0.018249$;

$b_{55} = 0.06427$.

2.25 

$b_0 = 4.114$; $b_1 = 0.508$; $b_2 = 1.021$; $b_3 = -1.192$; $b_4 = -1.625$;

$b_{11} = 0.043$; $b_{12} = -0.212$; $b_{13} = -0.025$; $b_{14} = -0.719$; $b_{22} = 0.249$;

$b_{23} = 0.212$; $b_{24} = 0.245$; $b_{33} = 0.718$; $b_{34} = 0.212$; $b_{44} = 0.456$.

$y = H/H_0$

$H_0$ - height of non-fluidized bed;

$H$ - height of fluidized bed.

2.26 

$y = 93 + 20X_1 - 30X_2 - 58X_3 + 9X_1X_2 + 11X_2X_3 - 22X_1X_3$

$y = 50 + 16X_1 - 30X_2 - 63X_3 + 11X_2^2 + 29X_3^2 + 9X_1X_2 + 11X_2X_3 - 22X_1X_3$

2.27 


y      b0        b1        b2        b3        b11       b22       b33       b12      b13      b23
y1     139.12    16.49     17.88     10.91     -4.01     -3.45     -1.57     5.13     7.13     7.88
y2     1261.11   268.15    246.50    139.48    -83.55    -124.79   199.17    69.38    94.13    104.38
y3     400.38    -99.67    -31.40    -73.92    7.93      17.31     0.43      8.75     6.25     1.25
y4     68.91     -1.41     4.32      1.63      1.56      0.06      -0.32     -1.63    0.13     -0.25

2.28

                               Factors                        Response
Name                           X1      X2      X3      X4     y_u1    y_u2    y_u3    y_u4    ȳ       ŷ
Basic level                    1.5     4.25    54.0    9.5
Variation intervals            1.0     0.5     12.0    5.0
Regression coefficients b_i    0.131   -0.019  0.481   -0.094
Product b_i x Δx_i             0.131   -       5.780   -
Basic step h_a                 -       -       12.0    -
Proportional calc.             0.273   -       12.0    -
Rounded up step                0.3     -       12.0    -

Abstract trials
1                              2.8     3.75    78      90     5.75    6.10    5.49    5.90    5.81    5.041
2                              3.1     3.75    90      90     6.10    4.50    6.60    6.40    5.90    -
3                              3.4     3.75    102     90     5.62    6.30    6.25    5.62    5.95    6.073
4                              3.7     3.75    114     90     6.40    7.00    6.00    5.50    6.22    6.589
5                              4.0     3.75    126     90     5.60    6.90    6.35    5.75    6.15    7.105
6                              4.3     3.75    138     90     6.10    6.60    6.75    6.75    6.55    7.621
7                              4.6     3.75    150     90     6.65    6.35    5.90    6.75    6.41    8.130
8                              4.9     3.75    162     90     5.65    5.45    5.75    6.00    5.71    -

2.29 


Name                            X1         X2          X3          Response
Basic level                     1.500      1.100       3.900
Variation interval              0.300      0.200       0.900
Regression coefficients b_j     0.00987    -0.00275    -0.00225
Product b_j x Δx_j              0.002961   -0.00055    -0.002025
Step in change x_j for 0.36     +0.36      -0.067      -0.246
Rounded up step                 +0.36      -0.07       -0.25

Trials    x1      x2      x3      y_u      ŷ
1         1.86    1.03    3.65    0.259    0.275
2         2.22    0.96    3.40    0.249    0.261
3         2.58    0.89    3.15    0.247    0.248
4         2.94    0.82    2.90    0.245    0.235
5         3.30    0.75    2.65    0.243    0.221
6         3.66    0.68    2.40    0.242    0.208
7         4.02    0.61    2.15    0.244    0.194
8         4.38    0.54    1.90    0.246    0.181

                      x1      x2      x3
Basic level           3.66    0.68    2.40
Lower level           3.16    0.38    1.50
Upper level           4.16    0.98    3.30
Variation interval    0.50    0.30    0.90
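The step sizes and trial points of the steepest-ascent table in solution 2.29 follow from the products of regression coefficient and variation interval. The sketch below assumes, as the table suggests, that the step for the first factor is fixed at 0.36 and that the other steps are scaled in proportion and rounded to two decimals. All names are illustrative.

```python
# Data of solution 2.29: basic levels, variation intervals and linear
# regression coefficients of the three factors.
basic_level = [1.500, 1.100, 3.900]
interval = [0.300, 0.200, 0.900]
coeff = [0.00987, -0.00275, -0.00225]
base_step_x1 = 0.36  # step chosen for the first factor (assumption from the table)

# Products b_j * (variation interval of x_j) give the direction of the gradient.
products = [b * d for b, d in zip(coeff, interval)]

# Scale all products so that the first factor moves by base_step_x1.
scale = base_step_x1 / products[0]
steps = [p * scale for p in products]
rounded_steps = [round(s, 2) for s in steps]


def trial_point(k):
    """Factor levels of the k-th trial along the gradient direction."""
    return [round(x0 + k * h, 2) for x0, h in zip(basic_level, rounded_steps)]


for k in range(1, 9):
    print(k, trial_point(k))
```

The sixth trial point, (3.66, 0.68, 2.40), is the one with the lowest measured response in the table and reappears as the basic level of the subsequent second-order design.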

No.   X0   X1       X2       X3       X1^2    X2^2    X3^2    X1X2   X1X3   X2X3   y_u     ŷ
1     +    -        -        -        +       +       +       +      +      +      0.258   0.257
2     +    +        -        -        +       +       +       -      -      +      0.280   0.276
3     +    -        +        -        +       +       +       -      +      -      0.271   0.275
4     +    +        +        -        +       +       +       +      -      -      0.272   0.276
5     +    -        -        +        +       +       +       +      -      -      0.247   0.244
6     +    +        -        +        +       +       +       -      +      -      0.255   0.252
7     +    -        +        +        +       +       +       -      -      +      0.253   0.257
8     +    +        +        +        +       +       +       +      +      +      0.246   0.248
9     +    -1.682   0        0        2.828   0       0       0      0      0      0.267   0.265
10    +    1.682    0        0        2.828   0       0       0      0      0      0.271   0.279
11    +    0        -1.682   0        0       2.828   0       0      0      0      0.251   0.246
12    +    0        1.682    0        0       2.828   0       0      0      0      0.253   0.257
13    +    0        0        -1.682   0       0       2.828   0      0      0      0.278   0.276
14    +    0        0        1.682    0       0       2.828   0      0      0      0.241   0.241
15    +    0        0        0        0       0       0       0      0      0      0.242   0.242
16    +    0        0        0        0       0       0       0      0      0      0.232   0.242
17    +    0        0        0        0       0       0       0      0      0      0.235   0.242
18    +    0        0        0        0       0       0       0      0      0      0.243   0.242
19    +    0        0        0        0       0       0       0      0      0      0.251   0.242
20    +    0        0        0        0       0       0       0      0      0      0.249   0.242

Regression coefficients have the values:

$b_0 = 0.242$; $b_1 = 0.0023$; $b_2 = 0.00347$; $b_3 = -0.01026$;

$b_{11} = 0.00944$; $b_{22} = 0.0033$; $b_{33} = 0.00587$; $b_{12} = -0.00429$;

$b_{13} = -0.00261$; $b_{23} = -0.001$.
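The calculated responses in the last column of the design matrix can be checked by evaluating the second-order model at the coded design points. The sketch below assumes the coefficient labels b1, b2, b3, b11, b22, b33, b12, b13, b23 in the order in which the values are listed, and a full quadratic model in three coded factors; the function name `predict` is illustrative.

```python
# Coefficients of solution 2.29 (labels assumed from the order of the values).
b0 = 0.242
b_lin = {1: 0.0023, 2: 0.00347, 3: -0.01026}
b_sq = {1: 0.00944, 2: 0.0033, 3: 0.00587}
b_int = {(1, 2): -0.00429, (1, 3): -0.00261, (2, 3): -0.001}


def predict(x1, x2, x3):
    """Response of the full second-order model at a coded point."""
    x = {1: x1, 2: x2, 3: x3}
    y = b0
    y += sum(b * x[i] for i, b in b_lin.items())             # linear terms
    y += sum(b * x[i] ** 2 for i, b in b_sq.items())         # quadratic terms
    y += sum(b * x[i] * x[j] for (i, j), b in b_int.items()) # interactions
    return y


# A factorial point, a star point and the center of the design.
for point in [(-1, -1, -1), (1, -1, -1), (0, 0, 1.682), (0, 0, 0)]:
    print(point, round(predict(*point), 3))
```

For these four points the model reproduces the tabulated calculated responses of trials 1, 2, 14 and the center runs.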

2.30 

$y - 463.53 = 25.21Z_1^2 + 76.17Z_2^2$

$Z_1 = 0.995(X_1 - 1.40) + 0.099(X_2 - 0.55)$

$Z_2 = 0.099(X_1 - 1.40) - 0.995(X_2 - 0.55)$
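The canonical form of solution 2.30 can be evaluated directly from the coded factors. The sketch assumes the usual canonical equation with squared canonical variables, y - 463.53 = 25.21 Z1^2 + 76.17 Z2^2, and uses the two printed transformation equations; function names are illustrative.

```python
def canonical_coordinates(x1, x2):
    """Shift to the center (1.40, 0.55) and rotate to the canonical axes."""
    dx1, dx2 = x1 - 1.40, x2 - 0.55
    z1 = 0.995 * dx1 + 0.099 * dx2
    z2 = 0.099 * dx1 - 0.995 * dx2
    return z1, z2


def response(x1, x2):
    """Response from the canonical equation (squared terms assumed)."""
    z1, z2 = canonical_coordinates(x1, x2)
    return 463.53 + 25.21 * z1 ** 2 + 76.17 * z2 ** 2


print(response(1.40, 0.55))            # response at the center of the surface
print(round(response(2.40, 0.55), 1))  # one unit away along X1
```

Under this assumption both canonical coefficients are positive, so the response grows in every direction away from the center, which is then a minimum.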

2.31 

$y - 52.12 = 0.35Z_1^2 - 1.85Z_2^2$

2.32 

$y_1 - 22.5 = 12.5Z_1^2 + 6.9Z_2^2$

$y_2 - 64.6 = 1.8Z_1^2 - 11.1Z_2^2$
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Tables  of  Statistical  Functions 


Table  A Random  numbers 


83  28 

78  05 

18  98 

49  22 

54  11 

92  37 

45  11 

63  60 

19  05 

91  26 

84  73 

82  58 

01  90 

55  37 

85  68 

98  15 

99  52 

99  84 

51  91 

73  81 

00  79 

20  99 

42  57 

55  67 

93  39 

99  25 

65  10 

91  54 

84  65 

16  23 

94  48 

02  99 

71  08 

50  84 

66  10 

10  34 

92  30 

89  28 

30  74 

24  24 

54  37 

52  43 

87  22 

21  34 

20  15 

07  67 

64  98 

36  01 

33  34 

04  42 

47  68 

59  90 

98  90 

27  71 

89  89 

98  20 

24  19 

85  02 

34  38 

26  71 

76  16 

58  55 

51  85 

44  00 

28  28 

38  91 

70  70 

16  81 

13  49 

46  54 

37  64 

90  35 

64  45 

47  72 

82  03 

01  65 

05  97 

13  90 

90  57 

51  97 

92  78 

39  12 

48  01 

83  46 

39  29 

98  71 

39  56 

97  66 

97  70 

05  77 

24  50 

29  02 

71  28 

53  99 

75  07 

13  18 

76  97 

72  54 

85  79 

71  60 

01  72 

71  23 

86  40 

70  05 

35  36 

15  64 

11  01 

11  18 

90  14 

95  05 

43  28 

52  77 

22  80 

49  89 

79  65 

91  17 

80  94 

34  02 

17  61 

00  42 

29  09 

19  54 

67  67 

88  54 

62  09 

07  97 

35  19 

31  25 

06  92 

25  02 

27  95 

74  89 

62  45 

75  39 

06  89 

58  96 

64  65 

81  84 

85  20 

01  47 

52  43 

54  97 

75  80 

00  38 

20  38 

57  46 

57  33 

87  19 

66  06 

40  32 

78  11 

60  42 

09  83 

28  40 

93  57 

61  22 

27  27 

47  80 

44  34 

47  27 

03  74 

36  27 

13  19 

14  76 

35  73 

66  29 

95  65 

12  87 

61  91 

34  30 

82  25 

35  57 

16  29 

21  27 

51  23 

06  52 

40  00 

28  11 

47  23 

63  01 

09  91 

87  20 

33  76 

61  55 

79  21 

74  36 

21  36 

05  47 

28  42 

92  51 

19  82 

00  40 

15  52 

45  35 

13  48 

74  10 

97  36 

22  85 

44  57 

91  72 

69  41 

17  07 

11  54 

36  81 

57  38 

55  39 

85  74 

48  05 

06  43 

10  63 

48  80 

36  26 

28  95 

03  79 

54  31 

41  55 

48  84 

78  63 

09  05 

69  07 

80  02 

51  78 

94  07 

88  62 

85  82 

80  37 

56  15 

59  30 

46  42 

84  02 

19  51 

95  22 

72  72 

95  51 

57  73 

04  68 

00  95 

04  30 

66  52 

60  74 

50  36 

31  76 

75  39 

04  95 

69  47 

95  23 

01  70 

95  04 

04  18 

68  14 

60  03 

34  57 

41  76 

35  06 

75  60 

21  58 

86  36 

02  33 

00  59 

63  13 

59  40 

60  83 

61  73 

45  18 

08  23 

54  86 

64  57 

76  70 

00  89 

43  24 

29  51 

12  43 

14  24 

35  78 

76  22 

82  50 

68  02 

13  19 

07  00 

19  07 

57  07 

34  86 

57  96 

99  57 

44  54 

90  87 

33  76 

71  71 

23  28 

88  37 

81  73 

29  08 

96  62 

34  26 

52  32 

23  74 

17  49 

45  62 

17  88 

50  50 

40  20 

21  54 

17  65 

99  31 

09  72 

67  87 

16  34 

00  76 

26  23 

42  40 

81  26 

86  30 

79  17 

93  45 

74  50 

50  24 

65  52 

06  59 

04  60 

73  63 

13  65 

31  57 

36  88 

98  35 

04  96 

41  37 

45  87 

57  57 

21  15 

34  59 

23  41 

47  66 

24  73 

31  96 

72  07 

09  43 

88  63 

33  80 

54  79 

84  18 

79  62 

53  27 

85  43 

51  69 

83  81 

90  85 

84  72 

18  48 

41  20 

81  59 

13  40 

75  73 

19  92 

12  01 

91  95 

23  99 

99  30 

30  58 

46  22 

64  41 

54  87 

97  55 

83  91 

42  61 

41  02 

40  18 

39  20 

56  19 

56  35 

04  32 

09  29 

30  63 

75  86 

85  29 

15  34 

68  92 

34  06 

81  60 

32  16 

05  37 

61  99 

27  99 

73  18 

94  29 

25  74 

22  20 

70  46 

30  38 

26  91 

59  16 

31  84 

93  27 

40  23 

25  86 

68  30 

10  11 

91  59 

61  07 

41  97 

10  39 

35  86 

11  25 

98  38 

27  14 

79  68 

77  60 

63  34 

23  80 

75  43 

48  79 

40  42 

68  85 

23  40 

27  56 

54  56 

75  65 

70  49 

24  08 

10  44 

75  59 

25  14 

94  00 

99  80 

81  44 

49  08 

98  93 

71  74 

11  14 

54  69 

71  69 

56  18 

75  63 

56  68 

25  36 

75  98 

00  18 

19  15 

24  28 

56  80 

75  97 

79  61 

54  67 

58  38 

93  69 

45  95 

61  19 

17  35 

89  90 

98  70 

26  20 

92  91 

85  49 

33  32 

46  67 

28  20 

40  99 

88  73 

56  33 

29  13 

41  89 

01  79 

85  45 

45  36 

05  67 

56  17 

59  77 

59  34 

35  01 

15  21 

00  35 

55  84 

71  36 

40  39 

47  25 

25  73 

69  14 

55  73 

35  86 

61  17 

98  69 

38  36 

66  66 

19  40 

90  83 

06  31 

24  67 

91  74 

54  14 

87  24 

61  80 

01  69 

50  70 

31  02 

98  86 

42  01 

94  98 

07  85 

28  38 

37  30 

72  76 
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Table  A (continued) 


87  08 

83  09 

40  14 

39  15 

99  24 

21  85 

00  45 

54  19 

36  18 

03  88 

88  33 

78  20 

40  40 

24  73 

77  70 

00  31 

84  59 

26  06 

50  30 

95  96 

22  50 

09  11 

00  37 

36  51 

55  95 

83  97 

13  75 

46  22 

77  50 

11  72 

48  70 

56  57 

16  24 

21  74 

91  53 

18  05 

59  61 

74  97 

31  82 

77  68 

93  45 

40  93 

12  80 

88  63 

26  93 

85  05 

19  87 

84  37 

59  76 

16  65 

50  76 

72  02 

39  19 

40  69 

57  23 

09  33 

20  70 

86  45 

13  94 

98  39 

91  64 

01  34 

67  13 

11  00 

32  09 

39  76 

21  64 

29  85 

65  14 

51  74 

33  20 

63  71 

95  94 

13  77 

12  44 

12  94 

91  04 

41  83 

79  72 

44  08 

90  59 

65  46 

78  82 

16  45 

97  85 

57  75 

79  96 

79  08 

16  83 

43  99 

05  10 

93  57 

80  32 

86  65 

26  90 

27  54 

34  94 

46  33 

65  35 

56  84 

92  85 

63  26 

69  69 

81  54 

70  56 

17  62 

43  17 

86  78 

99  62 

34  15 

08  50 

36  55 

82  11 

26  54 

76  88 

85  67 

82  21 

65  00 

83  89 

06  09 

59  36 

77  09 

83  87 

81  77 

93  77 

48  44 

88  30 

37  21 

74  02 

93  10 

05  85 

86  43 

25  50 

76  70 

36  32 

26  68 

54  92 

84  90 

02  38 

77  40 

13  46 

99  31 

30  29 

71  70 

91  10 

99  84 

55  31 

95  20 

90  28 

49  78 

56  27 

09  33 

66  79 

32  29 

50  54 

76  94 

27  01 

45  87 

29  66 

23  15 

54  15 

62  11 

22  33 

39  39 

58  30 

73  43 

59  32 

26  43 

76  12 

99  10 

83  01 

86  58 

89  77 

68  87 

29  71 

49  50 

46  53 

56  53 

41  53 

52  20 

00  28 

17  33 

81  42 

24  33 

55  75 

42  70 

73  65 

16  96 

47  17 

42  69 

52  29 

68  59 

32  69 

40  30 

89  12 

11  07 

18  53 

27  13 

46  54 

85  40 

64  43 

09  80 

68  29 

86  65 

60  27 

87  70 

77  45 

31  69 

12  31 

21  79 

80  68 

13  48 

80  84 

25  33 

70  89 

76  61 

03  41 

57  89 

87  07 

56  12 

28  72 

57  80 

54  05 

80  92 

82  65 

25  01 

74  58 

89  39 

25  05 

57  66 

33  48 

49  96 

00  17 

88  90 

63  67 

02  64 

71  12 

21  02 

29  86 

88  54 

04  41 

27  70 

10  49 

13  76 

99  38 

64  14 

90  60 

69  75 

10  97 

16  60 

21  31 

95  96 

89  48 

65  14 

12  02 

94  50 

35  64 

58  43 

92  07 

74  08 

52  08 

13  32 

36  45 

39  54 

82  26 

46  60 

04  19 

34  61 

36  12 

46  15 

90  57 

88  69 

61  05 

22  76 

90  79 

01  74 

22  08 

26  13 

95  13 

75  53 

76  50 

49  80 

25  61 

81  96 

19  92 

33  14 

60  41 

27  06 

05  98 

51  49 

06  84 

76  10 

54  41 

54  56 

15  96 

49  19 

65  51 

93  32 

54  54 

95  67 

47  92 

60  37 

45  39 

67  64 

70  05 

06  54 

84  10 

88  68 

33  60 

77  81 

71  87 

94  13 

64  75 

18  17 

76  80 

95  10 

33  33 

35  31 

30  47 

53  74 

38  30 

36  79 

74  83 

61  91 

56  22 

83  73 

15  54 

63  39 

50  33 

88  83 

09  80 

50  48 

23  26 

05  85 

68  97 

06  78 

00  17 

76  05 

95  31 

03  37 

82  52 

08  00 

33  76 

29  14 

18  59 

98  12 

89  34 

50  70 

13  07 

60  38 

14  18 

02  28 

72  80 

85  72 

09  59 

05  26 

05  26 

90  65 

47  12 

85  65 

62  60 

63  74 

20  31 

60  66 

90  87 

09  41 

59  73 

60  00 

21  96 

38  40 

15  02 

56  81 

29  34 

90  99 

07  57 

80  24 

92  41 

88  41 

01  88 

05  62 

23  32 

03  76 

20  25 

96  68 

01  99 

79  82 

58  06 

89  54 

74  06 

01  39 

96  66 

81  45 

01  09 

18  35 

41  97 

70  37 

94  95 

48  64 

01  75 

04  39 

12  41 

98  35 

82  38 

49  91 

71  57 

83  06 

55  84 

38  04 

70  18 

75  19 

70  78 

63  95 

94  82 

54  88 

47  69 

63  32 

79  75 

31  56 

38  92 

54  43 

30  43 

70  43 

70  32 

73  47 

49  64 

23  54 

59  17 

80  48 

61  66 

45  66 

36  58 

96  32 

60  46 

60  87 

52  75 

53  13 

39  19 

41  52 

24  14 

88  93 

17  35 

36  91 

90  59 

48  78 

99  31 

64  40 

84  05 

79  00 

53  03 

64  02 

73  30 

27  77 

44  50 

07  79 

27  66 

42  39 

97  64 

84  36 

18  13 

59  61 

92  15 

47  21 

82  54 

76  05 

54  10 

40  93 

71  96 

66  52 

83  98 

17  85 

05  02 

28  36 

50  64 

47  21 

36  25 

80  01 

43  41 

36  58 

97  15 

29  95 

51  22 

04  71 

06  37 

31  45 

69  62 

30  84 

20  28 

14  41 

70  05 

56  88 

23  28 

85  05 

96  40 

37  56 

52  60 

65  75 

21  47 

84  15 

99  92 

02  41 

Table B Standardized normal distribution

x      0.00     0.01     0.02     0.03     0.04     0.05     0.06     0.07     0.08     0.09
0.0    0.5000   0.5040   0.5080   0.5120   0.5160   0.5199   0.5239   0.5279   0.5319   0.5359
0.1    0.5398   0.5438   0.5478   0.5517   0.5557   0.5596   0.5636   0.5675   0.5714   0.5753
0.2    0.5793   0.5832   0.5871   0.5910   0.5948   0.5987   0.6026   0.6064   0.6103   0.6141
0.3    0.6179   0.6217   0.6255   0.6293   0.6331   0.6368   0.6406   0.6443   0.6480   0.6517
0.4    0.6554   0.6591   0.6628   0.6664   0.6700   0.6736   0.6772   0.6808   0.6844   0.6879
0.5    0.6915   0.6950   0.6985   0.7019   0.7054   0.7088   0.7123   0.7157   0.7190   0.7224
0.6    0.7257   0.7291   0.7324   0.7357   0.7389   0.7422   0.7454   0.7486   0.7517   0.7549
0.7    0.7580   0.7611   0.7642   0.7673   0.7704   0.7734   0.7764   0.7794   0.7823   0.7852
0.8    0.7881   0.7910   0.7939   0.7967   0.7995   0.8023   0.8051   0.8078   0.8106   0.8133
0.9    0.8159   0.8186   0.8212   0.8238   0.8264   0.8289   0.8315   0.8340   0.8365   0.8389
1.0    0.8413   0.8438   0.8461   0.8485   0.8508   0.8531   0.8554   0.8577   0.8599   0.8621
1.1    0.8643   0.8665   0.8686   0.8708   0.8729   0.8749   0.8770   0.8790   0.8810   0.8830
1.2    0.8849   0.8869   0.8888   0.8907   0.8925   0.8944   0.8962   0.8980   0.8997   0.9015
1.3    0.9032   0.9049   0.9066   0.9082   0.9099   0.9115   0.9131   0.9147   0.9162   0.9177
1.4    0.9192   0.9207   0.9222   0.9236   0.9251   0.9265   0.9279   0.9292   0.9306   0.9319
1.5    0.9332   0.9345   0.9357   0.9370   0.9382   0.9394   0.9406   0.9418   0.9429   0.9441
1.6    0.9452   0.9463   0.9474   0.9484   0.9495   0.9505   0.9515   0.9525   0.9535   0.9545
1.7    0.9554   0.9564   0.9573   0.9582   0.9591   0.9599   0.9608   0.9616   0.9625   0.9633
1.8    0.9641   0.9649   0.9656   0.9664   0.9671   0.9678   0.9686   0.9693   0.9699   0.9706
1.9    0.9713   0.9719   0.9726   0.9732   0.9738   0.9744   0.9750   0.9756   0.9761   0.9767
2.0    0.9772   0.9778   0.9783   0.9788   0.9793   0.9798   0.9803   0.9808   0.9812   0.9817
2.1    0.9821   0.9826   0.9830   0.9834   0.9838   0.9842   0.9846   0.9850   0.9854   0.9857
2.2    0.9861   0.9864   0.9868   0.9871   0.9875   0.9878   0.9881   0.9884   0.9887   0.9890
2.3    0.9893   0.9896   0.9898   0.9901   0.9904   0.9906   0.9909   0.9911   0.9913   0.9916
2.4    0.9918   0.9920   0.9922   0.9925   0.9927   0.9929   0.9931   0.9932   0.9934   0.9936
2.5    0.9938   0.9940   0.9941   0.9943   0.9945   0.9946   0.9948   0.9949   0.9951   0.9952
2.6    0.9953   0.9955   0.9956   0.9957   0.9959   0.9960   0.9961   0.9962   0.9963   0.9964
2.7    0.9965   0.9966   0.9967   0.9968   0.9969   0.9970   0.9971   0.9972   0.9973   0.9974
2.8    0.9974   0.9975   0.9976   0.9977   0.9977   0.9978   0.9979   0.9979   0.9980   0.9981
2.9    0.9981   0.9982   0.9982   0.9983   0.9984   0.9984   0.9985   0.9985   0.9986   0.9986
3.0    0.9987   0.9987   0.9987   0.9988   0.9988   0.9989   0.9989   0.9989   0.9990   0.9990
3.1    0.9990   0.9991   0.9991   0.9991   0.9992   0.9992   0.9992   0.9992   0.9993   0.9993
3.2    0.9993   0.9993   0.9994   0.9994   0.9994   0.9994   0.9994   0.9995   0.9995   0.9995
3.3    0.9995   0.9995   0.9995   0.9996   0.9996   0.9996   0.9996   0.9996   0.9996   0.9997
3.4    0.9997   0.9997   0.9997   0.9997   0.9997   0.9997   0.9997   0.9997   0.9997   0.9998
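Entries of Table B can be regenerated with the Python standard library, which is a convenient way to check a doubtful digit. The sketch uses `statistics.NormalDist`, whose `cdf` method returns the cumulative standardized normal distribution; the helper name `table_b_entry` is illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def table_b_entry(x):
    """Cumulative standardized normal distribution, rounded to 4 decimals."""
    return round(std_normal.cdf(x), 4)


# Rebuild one row of the table: x = 1.70, 1.71, ..., 1.79.
row = [table_b_entry(1.7 + j / 100) for j in range(10)]
print(row)
```

The argument of a table cell is the row label plus the column label, for example 1.7 + 0.08 = 1.78.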
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Table  C Student’s  t-distribution 


       α/2
f      0.1     0.05    0.025    0.01     0.005
1      3.08    6.31    12.70    31.80    63.70
2      1.89    2.92    4.30     6.96     9.92
3      1.64    2.35    3.18     4.54     5.84
4      1.53    2.13    2.78     3.75     4.60
5      1.48    2.01    2.57     3.36     4.03
6      1.44    1.94    2.45     3.14     3.71
7      1.42    1.89    2.36     3.00     3.50
8      1.40    1.86    2.31     2.90     3.36
9      1.38    1.83    2.26     2.82     3.25
10     1.37    1.81    2.23     2.76     3.17
11     1.36    1.80    2.20     2.72     3.11
12     1.36    1.78    2.18     2.68     3.05
13     1.35    1.77    2.16     2.65     3.01
14     1.34    1.76    2.14     2.62     2.98
15     1.34    1.75    2.13     2.60     2.95
16     1.34    1.75    2.12     2.58     2.92
17     1.33    1.74    2.11     2.57     2.90
18     1.33    1.73    2.10     2.55     2.88
19     1.33    1.73    2.09     2.54     2.86
20     1.32    1.72    2.09     2.53     2.85
21     1.32    1.72    2.08     2.52     2.83
22     1.32    1.72    2.07     2.51     2.82
23     1.32    1.71    2.07     2.50     2.81
24     1.32    1.71    2.06     2.49     2.80
25     1.32    1.71    2.06     2.48     2.79
26     1.32    1.71    2.06     2.48     2.78
27     1.31    1.70    2.05     2.47     2.77
28     1.31    1.70    2.05     2.47     2.76
29     1.31    1.70    2.05     2.46     2.76
30     1.31    1.70    2.04     2.46     2.75
40     1.30    1.68    2.02     2.42     2.70
60     1.30    1.67    2.00     2.39     2.66
120    1.29    1.66    1.98     2.36     2.62
∞      1.28    1.64    1.96     2.33     2.58

Table D Chi-square distribution

       P
f      0.995   0.99    0.975   0.95    0.9     0.75    0.5     0.25    0.1     0.05    0.025   0.01    0.005   0.001   f
1      -       -       -       -       0.016   0.102   0.455   1.32    2.71    3.84    5.02    6.63    7.88    10.8    1
2      0.010   0.020   0.051   0.103   0.211   0.575   1.39    2.77    4.61    5.99    7.38    9.21    10.6    13.8    2
3      0.072   0.115   0.216   0.352   0.584   1.21    2.37    4.11    6.25    7.81    9.35    11.3    12.8    16.3    3
4      0.207   0.297   0.484   0.711   1.06    1.92    3.36    5.39    7.78    9.49    11.1    13.3    14.9    18.5    4
5      0.412   0.554   0.831   1.15    1.61    2.67    4.35    6.63    9.24    11.1    12.8    15.1    16.7    20.5    5
6      0.676   0.872   1.24    1.64    2.20    3.45    5.35    7.84    10.6    12.6    14.4    16.8    18.5    22.5    6
7      0.989   1.24    1.69    2.17    2.83    4.25    6.35    9.04    12.0    14.1    16.0    18.5    20.3    24.3    7
8      1.34    1.65    2.18    2.73    3.49    5.07    7.34    10.2    13.4    15.5    17.5    20.1    22.0    26.1    8
9      1.73    2.09    2.70    3.33    4.17    5.90    8.34    11.4    14.7    16.9    19.0    21.7    23.6    27.9    9
10     2.16    2.56    3.25    3.94    4.87    6.74    9.34    12.5    16.0    18.3    20.5    23.2    25.2    29.6    10
11     2.60    3.05    3.82    4.57    5.58    7.58    10.3    13.7    17.3    19.7    21.9    24.7    26.8    31.3    11
12     3.07    3.57    4.40    5.23    6.30    8.44    11.3    14.8    18.5    21.0    23.3    26.2    28.3    32.9    12
13     3.57    4.11    5.01    5.89    7.04    9.30    12.3    16.0    19.8    22.4    24.7    27.7    29.8    34.5    13
14     4.07    4.66    5.63    6.57    7.79    10.2    13.3    17.1    21.1    23.7    26.1    29.1    31.3    36.1    14
15     4.60    5.23    6.26    7.26    8.55    11.0    14.3    18.2    22.3    25.0    27.5    30.6    32.8    37.7    15
16     5.14    5.81    6.91    7.96    9.31    11.9    15.3    19.4    23.5    26.3    28.8    32.0    34.3    39.3    16
17     5.70    6.41    7.56    8.67    10.1    12.8    16.3    20.5    24.8    27.6    30.2    33.4    35.7    40.8    17
18     6.26    7.01    8.23    9.39    10.9    13.7    17.3    21.6    26.0    28.9    31.5    34.8    37.2    42.3    18
19     6.84    7.63    8.91    10.1    11.7    14.6    18.3    22.7    27.2    30.1    32.9    36.2    38.6    43.8    19
20     7.43    8.26    9.59    10.9    12.4    15.5    19.3    23.8    28.4    31.4    34.2    37.6    40.0    45.3    20
21     8.03    8.90    10.3    11.6    13.2    16.3    20.3    24.9    29.6    32.7    35.5    38.9    41.4    46.8    21
22     8.64    9.54    11.0    12.3    14.0    17.2    21.3    26.0    30.8    33.9    36.8    40.3    42.8    48.3    22
23     9.26    10.2    11.7    13.1    14.8    18.1    22.3    27.1    32.0    35.2    38.1    41.6    44.2    49.7    23
24     9.89    10.9    12.4    13.8    15.7    19.0    23.3    28.2    33.2    36.4    39.4    43.0    45.6    51.2    24
25     10.5    11.5    13.1    14.6    16.5    19.9    24.3    29.3    34.4    37.7    40.6    44.3    46.9    52.6    25
26     11.2    12.2    13.8    15.4    17.3    20.8    25.3    30.4    35.6    38.9    41.9    45.6    48.3    54.1    26
27     11.8    12.9    14.6    16.2    18.1    21.7    26.3    31.5    36.7    40.1    43.2    47.0    49.6    55.5    27
28     12.5    13.6    15.3    16.9    18.9    22.7    27.3    32.6    37.9    41.3    44.5    48.3    51.0    56.9    28
29     13.1    14.3    16.0    17.7    19.8    23.6    28.3    33.7    39.1    42.6    45.7    49.6    52.3    58.3    29
30     13.8    15.0    16.8    18.5    20.6    24.5    29.3    34.8    40.3    43.8    47.0    50.9    53.7    59.7    30
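The meaning of P in Table D can be checked on the row f = 2, where a closed form exists: a chi-square variable with two degrees of freedom exceeds x with probability exp(-x/2), so the tabulated point is x = -2 ln P. The sketch below covers only this special case and is not a general chi-square routine; the function name is illustrative.

```python
import math


def chi_square_point_2df(p):
    """Value exceeded with probability p by a chi-square variable with f = 2."""
    return -2.0 * math.log(p)


for p in (0.995, 0.95, 0.5, 0.05, 0.01, 0.001):
    print(p, round(chi_square_point_2df(p), 3))
```

The computed values agree with the f = 2 row, which confirms that P is the probability of exceeding the tabulated value.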

Table  E F-distribution 
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Table  ] Laplace  function 
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0.4572 

0.4531 

0.4493 

0.4309 

0.4128 

z      0           1           2           3           4           5           6           7           8           9
2.5    0.4937903   0.4939634   0.4941323   0.4942969   0.4944574   0.4946139   0.4947664   0.4949151   0.4950600   0.4952012
2.6    0.4953388   0.4954729   0.4956035   0.4957308   0.4958547   0.4959754   0.4960930   0.4962074   0.4963189   0.4964274
2.7    0.4965330   0.4966358   0.4967359   0.4968333   0.4969280   0.4970202   0.4971099   0.4971972   0.4972821   0.4973646
2.8    0.4974449   0.4975229   0.4975988   0.4976726   0.4977443   0.4978140   0.4978818   0.4979476   0.4980116   0.4980738
2.9    0.4981342   0.4981929   0.4982498   0.4983052   0.4983589   0.4984111   0.4984618   0.4985110   0.4985588   0.4986051
3.0    0.4986501   0.4986938   0.4987361   0.4987772   0.4988171   0.4988558   0.4988933   0.4989297   0.4989650   0.4989992
3.1    0.4990324   0.4990646   0.4990957   0.4991260   0.4991553   0.4991836   0.4992112   0.4992378   0.4992636   0.4992886
3.2    0.4993129   0.4993363   0.4993590   0.4993810   0.4994024   0.4994230   0.4994429   0.4994623   0.4994810   0.4994991
3.3    0.4995166   0.4995335   0.4995499   0.4995658   0.4995811   0.4995959   0.4996103   0.4996242   0.4996376   0.4996505
3.4    0.4996631   0.4996752   0.4996869   0.4996982   0.4997091   0.4997197   0.4997299   0.4997398   0.4997493   0.4997585
3.5    0.4997674   0.4997759   0.4997842   0.4997922   0.4997999   0.4998074   0.4998146   0.4998215   0.4998282   0.4998347
3.6    0.4998409   0.4998469   0.4998527   0.4998583   0.4998637   0.4998689   0.4998739   0.4998787   0.4998834   0.4998879
3.7    0.4998922   0.4998964   0.4999004   0.4999043   0.4999080   0.4999116   0.4999150   0.4999184   0.4999216   0.4999247
3.8    0.4999277   0.4999305   0.4999333   0.4999359   0.4999385   0.4999409   0.4999433   0.4999456   0.4999478   0.4999499
3.9    0.4999519   0.4999539   0.4999557   0.4999575   0.4999593   0.4999609   0.4999625   0.4999641   0.4999655   0.4999670
4.0    0.4999683   0.4999696   0.4999709   0.4999721   0.4999733   0.4999744   0.4999755   0.4999765   0.4999775   0.4999784
4.1    0.4999793   0.4999802   0.4999811   0.4999819   0.4999826   0.4999834   0.4999841   0.4999848   0.4999854   0.4999861
4.2    0.4999867   0.4999872   0.4999878   0.4999883   0.4999888   0.4999893   0.4999898   0.4999902   0.4999907   0.4999911
4.3    0.4999915   0.4999918   0.4999922   0.4999925   0.4999929   0.4999932   0.4999935   0.4999938   0.4999941   0.4999943
4.4    0.4999946   0.4999948   0.4999951   0.4999953   0.4999955   0.4999957   0.4999959   0.4999961   0.4999963   0.4999964
4.5    0.4999966   0.4999968   0.4999969   0.4999971   0.4999972   0.4999973   0.4999974   0.4999976   0.4999977   0.4999978
5.0    0.4999997
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